THE ANNALS 
of 
MATHEMATICAL 


STATISTICS 


(FOUNDED BY H. C. CARVER) 


THE OFFICIAL JOURNAL OF THE INSTITUTE
OF MATHEMATICAL STATISTICS 


Contents 


On the Variance of Estimates. G. R. SETH
On the Theory of Some Non-Parametric Hypotheses. E. L. LEHMANN AND 


Estimation of the Parameters of a Single Equation in a Complete System of
Stochastic Equations. T. W. ANDERSON AND HERMAN RUBIN

Some Significance Tests for the Median which are Valid Under Very General 
Conditions. JOHN E. WALSH

A Direct Method for Producing Random Digits in Any Number System. H.
BURKE HORTON AND R. TYNES SMITH III

On a Matching Problem Arising in Genetics. HOWARD LEVENE


A Multiple Decision Procedure for Certain Problems in the Analysis of Vari- 
ance. 


On Distinct Hypotheses. 

An Approximation to the Sampling Variance of an Estimated Maximum Value 
of Given Frequency Based on Fit of Doubly Exponential Distribution of 
Maximum Values. BRADFORD F. KIMBALL

Notes: 

Tests of Independence in Contingency Tables as Unconditional Tests. 
A. M. MOOD

The 5% Significance Levels for Sums of Squares of Rank Differences and
a Correction. EDWIN G. OLDS

Independence of Non-Negative Quadratic Forms in Normally Correlated
Variables. BERTIL MATÉRN

A Formula for the Partial Sums of Some Hypergeometric Series. HERMANN
VON SCHELLING

The Variance of the Proportions of Samples Falling Within a Fixed Interval
for a Normal Population. G. A. BAKER

The Point Biserial Coefficient of Correlation. JOSEPH LEV

A Note on Kac's Derivation of the Distribution of the Mean Deviation.
H. J. GODWIN

Correction to "Asymptotic Formulas for Significance Levels of Certain
Distributions". A. M. PEISER

Abstracts of Papers 

News and Notices 

Election of Officers and Council and Revision of By-Laws 

Report on the Seattle Meeting of the Institute

Report on the Cleveland Meeting of the Institute

Report of the President of the Institute 

Report of the Secretary-Treasurer of the Institute 



Vol. XX, No. 1 — March, 1949 







THE ANNALS 
OF MATHEMATICAL STATISTICS 


EDITED BY 
S. S. WILKS, Editor


M. S. BARTLETT HARALD CRAMÉR J. NEYMAN
WILLIAM G. COCHRAN W. EDWARDS DEMING WALTER A. SHEWHART 
ALLEN T. CRAIG J. L. DOOB JOHN W. TUKEY 
C. C. CRAIG W. FELLER A. WALD 
HAROLD HOTELLING 


WITH THE COOPERATION OF 


T. W. Anderson, Jr. Churchill Eisenhart H. B. Mann

David Blackwell M. A. Girshick Alexander M. Mood

J. H. Curtiss Paul R. Halmos Frederick Mosteller

J. F. Daly Paul G. Hoel H. E. Robbins

Harold F. Dodge Mark Kac Henry Scheffé

Paul S. Dwyer E. L. Lehmann Jacob Wolfowitz
William G. Madow


The ANNALS OF MATHEMATICAL STATISTICS is published quarterly by the
Institute of Mathematical Statistics, Mt. Royal & Guilford Aves., Baltimore 2,
Md. Subscriptions, renewals, orders for back numbers and other business
communications should be sent to the ANNALS OF MATHEMATICAL STATISTICS, Mt.
Royal & Guilford Aves., Baltimore 2, Md., or to the Secretary of the
Institute of Mathematical Statistics, P. S. Dwyer, 116 Rackham Hall, University of
Michigan, Ann Arbor, Mich.


Changes in mailing address which are to become effective for a given issue 
should be reported to the Secretary on or before the 15th of the month preceding 
the month of that issue. The months of issue are March, June, September and 
December. 


Manuscripts for publication in the ANNALS OF MATHEMATICAL STATISTICS 
should be sent to S. S. Wilks, Fine Hall, Princeton, New Jersey. Manuscripts 
should be typewritten double-spaced with wide margins, and the original copy 
should be submitted. Footnotes should be reduced to a minimum and whenever 
possible replaced by a bibliography at the end of the paper; formulae in foot- 
notes should be avoided. Figures, charts, and diagrams should be drawn on 
plain white paper or tracing cloth in black India ink twice the size they are to 
be printed. Authors are requested to keep in mind typographical difficulties 
of complicated mathematical formulae. 


Authors will ordinarily receive only galley proofs. Fifty reprints without 
covers will be furnished free. Additional reprints and covers furnished at cost. 


The subscription price for the ANNALS is $8.00 inside the Western Hemi- 
sphere and $5.00 elsewhere. Single copies $3.00. Back numbers are available 
at $8.00 per volume or $3.00 per single issue. 


COMPOSED AND PRINTED AT THE 
WAVERLY PRESS, Inc. 
BALTIMORE, MD., U. S. A.





Entered as second-class matter at the Post Office at Baltimore, Maryland, under the act of March 3, 1879. 










ON THE VARIANCE OF ESTIMATES 
By G. R. SETH


Columbia University 


Summary. In this paper recent results on the lower bound to the variance 
of unbiased estimates have been brought together. Some of them have been 
extended to sequential estimates and the others have been improved to some 
extent. In the last section a general method for generating a system of orthog- 
onal polynomials with respect to a certain class of weight functions is obtained 
together with a result on the conditions under which the class of unbiased esti- 
mates formed by all functions of an unbiased estimate consists of just one element. 


1. Introduction. 

§1.1. Let $X_1, X_2, \ldots$ be a sequence of chance variables whose distribution
depends upon an unknown parameter $\theta$ and possibly also a finite number of other
parameters. It is assumed that either all the $X$'s are absolutely continuous or
that they are all discrete. Let $p_M(x_1, x_2, \ldots, x_M; \theta)$ denote the joint
probability density function or the probability of $(X_1, \ldots, X_M)$ according as the $X$'s
are continuous or discrete. Let $\theta^*(x_1, x_2, \ldots, x_n)$ be an unbiased estimate of
$\theta$, where $x_1, x_2, \ldots, x_n$ is a sequence of observations on $X_1, X_2, \ldots, X_n$.

In this paper, we shall make use of the following short forms and abbrevia- 
tions: 

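$E(X)$ will represent the expectation of $X$.

$\sigma^2(X)$ will represent the variance of $X$.

$E(y \mid x)$ will represent the conditional expectation of $y$, given $x$.

$\theta^*$ will represent an abbreviation of $\theta^*(x_1, x_2, \ldots, x_n)$.

$f$ will represent an abbreviation of $f(x; \theta)$ or $f(x; \theta_1, \theta_2, \ldots, \theta_T)$.

$p_n$ will represent an abbreviation of $p_n(x_1, x_2, \ldots, x_n; \theta)$ or $p_n(x_1, x_2, \ldots, x_n; \theta_1, \theta_2, \ldots, \theta_T)$.

$p_N$ will represent $p_n$ for a fixed size sample, i.e., $n = N$.

$g$ will represent an abbreviation of $g(\theta^*; \theta)$ or $g(\theta_1^*, \theta_2^*, \ldots, \theta_T^*; \theta_1, \theta_2, \ldots, \theta_T)$.

$h$ will represent $h(\xi_1, \xi_2, \ldots, \xi_{N-1} \mid \theta^*; \theta)$ or $h(\xi_1, \xi_2, \ldots, \xi_{N-T} \mid \theta_1^*, \theta_2^*, \ldots, \theta_T^*; \theta_1, \theta_2, \ldots, \theta_T)$.

$\phi_{i_1, i_2, \ldots, i_T}(n)$ will represent $\dfrac{1}{p_n}\,\dfrac{\partial^{\,i_1+i_2+\cdots+i_T}}{\partial\theta_1^{i_1}\,\partial\theta_2^{i_2}\cdots\partial\theta_T^{i_T}}\,p_n$.

$g_{i_1, i_2, \ldots, i_T}$ will represent $\dfrac{1}{g}\,\dfrac{\partial^{\,i_1+i_2+\cdots+i_T}}{\partial\theta_1^{i_1}\,\partial\theta_2^{i_2}\cdots\partial\theta_T^{i_T}}\,g$.

$h_{i_1, i_2, \ldots, i_T}$ will represent $\dfrac{1}{h}\,\dfrac{\partial^{\,i_1+i_2+\cdots+i_T}}{\partial\theta_1^{i_1}\,\partial\theta_2^{i_2}\cdots\partial\theta_T^{i_T}}\,h$.

In case differentiations with respect to one parameter are involved, the last
three abbreviations will be shortened to $\phi_i(n)$, $g_i$ and $h_i$ respectively.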










In §1.1, $n$ is assumed to be a constant equal to $N$, that is, the sequence of chance
variables is finite and fixed, consisting of $X_1, X_2, X_3, \ldots, X_N$.
Cramér [1] and Rao [2] have shown that under certain conditions of regularity,
the variance of $\theta^*(x_1, x_2, \ldots, x_N)$ satisfies the inequality:

(1.1.1) $$\sigma^2(\theta^*(x_1, x_2, \ldots, x_N)) \ge \frac{1}{E\left(\dfrac{1}{p_N}\,\dfrac{\partial p_N}{\partial\theta}\right)^2}.$$

Cramér [1] has shown that the lower bound for the variance of
$\theta^*(x_1, x_2, \ldots, x_N)$ given by (1.1.1) is achieved if and only if:

(1.1.2). There exists a sufficient statistic for estimating $\theta$.

(1.1.3). The probability distribution $g(\theta^*; \theta)$ of the sufficient statistic
$\theta^*(x_1, x_2, \ldots, x_N)$ is of the form

$$\theta^*(x_1, x_2, \ldots, x_N) - \theta = \frac{K}{g(\theta^*; \theta)}\,\frac{\partial}{\partial\theta}\,g(\theta^*; \theta), \quad \text{whenever } g(\theta^*; \theta) > 0,$$

where $K$ depends only upon $N$ and the parameters in the distribution.

Cramér calls the statistic $\theta^*(x_1, x_2, \ldots, x_N)$ satisfying (1.1.2) and (1.1.3)
an "efficient" statistic estimating $\theta$ and we will use the word "efficient" in this
sense alone. Bhattacharyya [3] has shown that there exists a lower bound to the
variance of $\theta^*(x_1, x_2, \ldots, x_N)$ which is higher than or equal to the one given in
(1.1.1). This lower bound is ${}_m\lambda^{11}$, that is,

(1.1.4) $$\sigma^2(\theta^*(x_1, x_2, \ldots, x_N)) \ge {}_m\lambda^{11},$$

where

$$\| {}_m\lambda^{ij} \| = \| \lambda_{ij} \|^{-1},$$

and

(1.1.5) $$\lambda_{ij} = E\left(\frac{1}{p_N^2}\,\frac{\partial^i p_N}{\partial\theta^i}\,\frac{\partial^j p_N}{\partial\theta^j}\right), \quad i, j = 1, 2, \ldots, m,$$

where $m$ is any positive integer.
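Let $\theta$ consist of $T$ components $\theta_1, \theta_2, \ldots, \theta_T$, and let $p_N(x_1, x_2, \ldots, x_N; \theta)$
be the same as $\prod_{i=1}^{N} f(x_i; \theta_1, \theta_2, \ldots, \theta_T)$. Further let $\theta_1^*(x_1, x_2, \ldots, x_N)$,
$\theta_2^*(x_1, x_2, \ldots, x_N)$, $\ldots$, $\theta_T^*(x_1, x_2, \ldots, x_N)$ be unbiased estimates of $\theta_1, \theta_2,
\ldots, \theta_T$ respectively, with the non-singular covariance matrix $\| V_{ij} \|$
$(i, j = 1, 2, \ldots, T)$. Cramér [4] has proved that under certain regularity
conditions, the ellipsoid

(1.1.6) $$\sum_{i,j=1}^{T} V^{ij} t_i t_j = T + 2$$

contains within itself the ellipsoid

(1.1.7) $$\sum_{i,j=1}^{T} I_{ij} t_i t_j = T + 2,$$

where

(1.1.8) $$\| V^{ij} \| = \| V_{ij} \|^{-1},$$

and

(1.1.9) $$I_{ij} = E\left(\frac{N}{f^2}\,\frac{\partial f}{\partial\theta_i}\,\frac{\partial f}{\partial\theta_j}\right).$$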

This result is also implicitly contained in Rao [2]. 

§1.2. Let us now take $n$ as a chance variable determined by a sequential
procedure. $X_1, X_2, X_3, \ldots$ is a sequence of chance variables having the same
probability density or probability $f(x; \theta)$, according as $X$ is absolutely
continuous or discrete. The sequential process tells us, after each successive
observation has been drawn, whether the next observation is to be taken or not. Thus
$n$ will denote the total number of observations taken by the time the sequential
process has been completed. Under certain regularity conditions, Wolfowitz
[5] has shown that if $\theta^*(x_1, x_2, \ldots, x_n)$ is an unbiased estimate of $\theta$, then

(1.2.1) $$\sigma^2(\theta^*(x_1, x_2, \ldots, x_n)) \ge \frac{1}{E n \cdot E\left(\dfrac{\partial}{\partial\theta} \log f(x; \theta)\right)^2}.$$

Furthermore, if $\theta$ consists of $T$ components, $\theta_1, \theta_2, \ldots, \theta_T$, and
$\theta_1^*(x_1, x_2, \ldots, x_n)$, $\theta_2^*(x_1, x_2, \ldots, x_n)$, $\ldots$, $\theta_T^*(x_1, x_2, \ldots, x_n)$ are unbiased
estimates of $\theta_1, \theta_2, \ldots, \theta_T$ respectively, Wolfowitz [5] has proved that the ellipsoid

(1.2.2) $$\sum_{i,j=1}^{T} I_{ij} t_i t_j = T + 2$$

is contained within the ellipsoid

(1.2.3) $$\sum_{i,j=1}^{T} V^{ij} t_i t_j = T + 2,$$

where

$$I_{ij} = E n \cdot E\left(\frac{\partial \log f}{\partial\theta_i}\,\frac{\partial \log f}{\partial\theta_j}\right), \quad i, j = 1, \ldots, T.$$

Blackwell and Girshick [6] have shown that the lower bound given by (1.2.1)
for the variance of an unbiased estimate of $\theta$ is attained only for the sequential
process for which $\Pr(n = N) = 1$, if the probability density function $f(x; \theta)$ of
$X$ is such that $E(X) = \theta$ and $x_1 + x_2 + x_3 + \cdots + x_M$ is a sufficient statistic, for
all integral values of $M$, for estimating $\theta$; $x_1, x_2, \ldots, x_M$ being $M$ independent
observations on the chance variable $X$.

In this paper the following results have been obtained. The specific condi- 
tions under which the results hold are stated at their proper places along with 
the results: 

(1.3.1). The lower bound in (1.1.4) is valid when $n$ is considered a
chance variable determined by a sequential procedure instead of being a fixed
number $N$.

(1.3.2). The concentration ellipsoid defined in (1.2.3) contains within itself
another ellipsoid

$$\sum_{i,j=1}^{T} \mu_{ij} t_i t_j = T + 2,$$

where $\mu_{ij}$ is given by (3.1.18), which in turn contains the ellipsoid given by
(1.2.2).


(1.3.3). The Blackwell and Girshick result [6] for the achievement of the lower
bound for the variance of unbiased estimates given by (1.2.1) has been extended
to the case where the probability density (or probability) $\prod_{i=1}^{M} f(x_i; \theta)$, for all
fixed $M \ge N$, where $N$ is the least value for which $\Pr(n = N) \ne 0$, has an
unbiased "efficient" estimate for $\theta$ in the sense defined by Cramér. This is
illustrated by two examples of Wald sequential procedures.


(1.3.4). Let $N$ be fixed and $p_N(x_1, x_2, \ldots, x_N; \theta) \cdot |J| = g(\theta^*; \theta)\,
h(\xi_1, \xi_2, \ldots, \xi_{N-1} \mid \theta^*; \theta)$, where $J$ denotes the Jacobian of the
transformation from $x_1, x_2, \ldots, x_N$ to $\theta^*, \xi_1, \xi_2, \ldots, \xi_{N-1}$. Here $g(\theta^*; \theta)$ and
$h(\xi_1, \xi_2, \ldots, \xi_{N-1} \mid \theta^*; \theta)$ are respectively the probability density function (or
probability) of $\theta^*$ and the conditional probability density function (or
probability) of $\xi_1, \xi_2, \ldots, \xi_{N-1}$ for a given value of $\theta^*$.

The necessary and sufficient conditions under which the lower bound for the
variance of unbiased estimates given by Bhattacharyya [3] may be achieved are
that there should exist a statistic $\theta^*(x_1, x_2, \ldots, x_N)$ such that:

(a) $h_1, h_2, \ldots, h_m$ are linearly dependent considered as functions of $\xi_1,
\xi_2, \ldots, \xi_{N-1}$ for given values of $\theta$ and $\theta^*(x_1, x_2, \ldots, x_N)$, and

(b) the probability density $g(\theta^*; \theta)$ of $\theta^*(x_1, x_2, \ldots, x_N)$ satisfies the
following equation:

$$\theta^*(x_1, x_2, \ldots, x_N) - \theta = \sum_{i=1}^{m} K_i\,\frac{1}{g}\,\frac{\partial^i}{\partial\theta^i}\,g(\theta^*; \theta),$$

where the $K_i$ are independent of $x_1, x_2, x_3, \ldots, x_N$.

Equivalent conditions for the multiparameter case have also been given. 

(1.3.5). The following properties of $\phi_1(n), \phi_2(n), \ldots$ are derived:

(a) Under certain conditions $\phi_1(N), \phi_2(N), \ldots$ form a system of orthogonal
polynomials in $\phi_1(N)$, the weight function being $p_N(x_1, x_2, \ldots, x_N; \theta)$.

(b) $\sum_{i=1}^{m} K_i \phi_i(n)$ cannot be a function of $x_1, x_2, \ldots, x_n$ independent of $\theta$
except for the constant zero.

(c) If $\theta^*(x_1, x_2, \ldots, x_n)$ is linearly dependent upon $\phi_1(n)$, then no other
statistic except of the form $a\theta^*(x_1, x_2, \ldots, x_n) + b$, where $a$ and $b$ are
constants independent of $\theta$, can be linearly related with $\phi_1(n)$.

(1.3.6). If (a) $\theta^*(x_1, x_2, \ldots, x_N)$ is an unbiased estimate of $\theta$ and (b) among
all functions of $\theta^*(x_1, x_2, \ldots, x_N)$ which are unbiased estimates of $\theta$ with finite
variance, $\theta^*$ is the one with the least variance, and the set of polynomials
with respect to the distribution function of $\theta^*$ is complete, then there is
no other function of $\theta^*$ having a finite variance which is an unbiased estimate of $\theta$.


2. Estimation of a single parameter. 

§2.1. Let $X_1, X_2, \ldots$ and $p_M(x_1, x_2, \ldots, x_M; \theta)$ be as given in the first
paragraph of §1.1. Let $\Omega$ be the space of all possible infinite sequences $(\omega)$ of
observations $x_1, x_2, \ldots$. Let there be given an infinite sequence of Borel
measurable functions $\Phi_1(x_1), \Phi_2(x_1, x_2), \ldots, \Phi_j(x_1, x_2, x_3, \ldots, x_j), \ldots$, defined for
all observable sequences in $\Omega$ such that each takes only the values zero and one.
We further assume that everywhere in $\Omega$, except possibly on a set whose
probability is zero for all $\theta$ under consideration, at least one of the functions $\Phi_1(x_1),
\Phi_2(x_1, x_2), \ldots$ takes the value of one. Let $n$ be the smallest integer for which
this occurs. Thus $n(\omega)$ is a chance variable. The sequential process is then
defined as follows:

Take an observation and find $\Phi_1(x_1)$. If it is unity, the sampling process stops;
otherwise continue sampling. If a second observation is taken and the value of
$\Phi_2(x_1, x_2)$ is unity, the process stops; otherwise continue sampling, and so on.
In general, if after taking $j$ observations

$$\Phi_i(x_1, x_2, \ldots, x_i) = 0 \quad \text{for } i = 1, 2, \ldots, j - 1$$

and $\Phi_j(x_1, x_2, \ldots, x_j) = 1$, sampling stops; otherwise it is continued. We
will denote by $R_j$ the set of all points $(x_1, x_2, \ldots, x_j)$ for which the process
stops with the $j$th observation.
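
The stopping rule just described may be sketched in code. This is only a minimal illustration and not part of the paper: the names `sample_until_stop` and `stop_when_sum_reaches` are hypothetical, and the indicator passed in stands for the functions $\Phi_j$ evaluated on the first $j$ observations.

```python
from typing import Callable, Iterable, List, Tuple


def sample_until_stop(
    stream: Iterable[float],
    indicator: Callable[[List[float]], int],
) -> Tuple[int, List[float]]:
    """Draw observations one at a time until the indicator returns 1.

    `indicator(xs)` plays the role of Phi_j(x_1, ..., x_j): it sees only the
    observations drawn so far and returns 0 (continue) or 1 (stop).
    Returns (n, observations), where n is the number of observations taken.
    """
    xs: List[float] = []
    for x in stream:
        xs.append(x)
        if indicator(xs) == 1:
            return len(xs), xs
    raise RuntimeError("the stream ended before the stopping rule fired")


def stop_when_sum_reaches(level: float) -> Callable[[List[float]], int]:
    """Hypothetical rule: stop as soon as the running sum is >= level."""
    return lambda xs: 1 if sum(xs) >= level else 0


n, observed = sample_until_stop([1.0, 0.5, 2.0, 4.0], stop_when_sum_reaches(3.0))
print(n, observed)
```

In this sketch the set $R_j$ corresponds to those prefixes of length $j$ on which the indicator first returns 1; a fixed sample size $N$ is the special case in which the indicator returns 1 exactly when $j = N$.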

Let $\theta^*(x_1, x_2, \ldots, x_n)$ be a statistic whose expectation is a real valued
function $\gamma(\theta)$ of $\theta$. The development proceeds on the assumption that
$p_M(x_1, x_2, \ldots, x_M; \theta)$ is a probability density function. The result is equally
valid if $p_M(x_1, x_2, \ldots, x_M; \theta)$ is the probability of discrete variables $X_1,
X_2, \ldots, X_M$ provided that integration is replaced by summation whenever
this is required. Further the phrase "almost all points" in a Euclidean space of
any finite dimensionality is understood to mean all points in the space with the
following possible exceptions:

(a). A set of Lebesgue measure zero where $p_M(x_1, x_2, \ldots, x_M; \theta)$ is the
probability density function;

(b). The points which belong to the set $Z$, where $p_M(x_1, x_2, \ldots, x_M; \theta)$ is the
probability function of the discrete chance variables $X_1, X_2, \ldots, X_M$. The
set $Z$ consists of all points $(x_1, x_2, \ldots, x_M)$ such that $p_M(x_1, x_2, \ldots, x_M; \theta) =
0$ identically for all $\theta$ under consideration.

§2.2. Conditions of regularity. We will postulate the following conditions to
be satisfied by $p_M(x_1, x_2, \ldots, x_M; \theta)$ and $\theta^*(x_1, x_2, \ldots, x_n)$.

(2.2.1). $\theta^*(x_1, x_2, \ldots, x_n)$ has an expectation $\gamma(\theta)$ and a finite variance. All
the derivatives of $\gamma(\theta)$ are assumed to be finite. The parameter $\theta$ lies in an open
interval $D$ of the real line. $D$ may consist of the entire line or an entire half line.

(2.2.2). The derivatives

$$\frac{\partial^i p_M}{\partial\theta^i} \quad (i = 1, 2, \ldots, m)$$

exist for all $\theta$ in $D$ and almost all $(x_1, x_2, \ldots, x_M)$ in $R_M$ and for all $M$. We define

$$\frac{1}{p_M}\,\frac{\partial^i p_M}{\partial\theta^i} = 0$$

whenever $p_M(x_1, x_2, \ldots, x_M; \theta) = 0$; thus,

$$\frac{1}{p_M}\,\frac{\partial^i p_M}{\partial\theta^i} = \phi_i(M)$$

is defined for all $\theta$ in $D$ and almost all $(x_1, x_2, \ldots, x_M)$ in $R_M$.

(2.2.3). For any integral $j$ there exist non-negative L-measurable functions
$T_i(x_1, x_2, \ldots, x_j)$, $(i = 1, 2, \ldots, m)$, such that

(a) $$\left|\,\theta^*(x_1, x_2, \ldots, x_j)\,\frac{\partial^i}{\partial\theta^i}\,p_j(x_1, x_2, \ldots, x_j; \theta)\right| < T_i(x_1, x_2, \ldots, x_j),$$

for all $\theta$ in $D$ and almost all $(x_1, x_2, \ldots, x_j)$ in $R_j$.

(b) $$\int_{R_j} T_i(x_1, x_2, \ldots, x_j) \prod_{u=1}^{j} dx_u, \quad (i = 1, 2, \ldots, m),$$

are finite.

(2.2.4). Let

$$t_j(\theta) = \int_{R_j} \theta^*(x_1, x_2, \ldots, x_j)\,p_j(x_1, x_2, \ldots, x_j; \theta) \prod_{u=1}^{j} dx_u.$$

We postulate the uniform convergence of

$$\sum_{j=1}^{\infty} \frac{d^i}{d\theta^i}\,t_j(\theta), \quad (i = 1, 2, \ldots, m)$$

(the existence of $\frac{d^i}{d\theta^i}\,t_j(\theta)$ is assured by the assumption (2.2.3)).

(2.2.5). There exist functions $S_i(x_1, x_2, \ldots, x_j)$ for every $j$, $(i = 1, 2, \ldots, m)$,
such that when $\theta^*(x_1, x_2, \ldots, x_j)$ and $T_i(x_1, x_2, \ldots, x_j)$ are replaced by
unity and $S_i(x_1, x_2, \ldots, x_j)$ respectively, conditions (2.2.3) and (2.2.4) still
hold good.

(2.2.6). The covariance matrix of $\phi_i(n)$ $(i = 1, \ldots, m)$ exists and is
non-singular for almost all $\theta$ in $D$ and almost all $(x_1, x_2, \ldots, x_n)$.

§2.3. Let us consider the sequential process mentioned in §2.1 and the
functions $\theta^*(x_1, x_2, \ldots, x_n)$ and $p_M(x_1, x_2, \ldots, x_M; \theta)$ which satisfy the regularity
conditions in §2.2. We will now find a lower bound for the variance of such
estimates.

Let us examine

(2.3.1) $$F = E\left(\theta^*(x_1, x_2, \ldots, x_n) - \gamma(\theta) - \sum_{i=1}^{m} K_i \phi_i(n)\right)^2,$$

where $K_i$ $(i = 1, 2, \ldots, m)$ are independent of $(x_1, x_2, \ldots, x_n)$. Now (2.3.1)
can be written as

(2.3.2) $$F = \sigma^2(\theta^*(x_1, x_2, \ldots, x_n)) - 2\sum_{i=1}^{m} K_i\,E\big(\theta^*(x_1, x_2, \ldots, x_n)\,\phi_i(n)\big) + 2\gamma(\theta)\sum_{i=1}^{m} K_i\,E\phi_i(n) + \sum_{i,j=1}^{m} K_i K_j \lambda_{ij},$$

where

(2.3.3) $$\lambda_{ij} = E(\phi_i(n)\phi_j(n)) \quad (i, j = 1, 2, \ldots, m).$$
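Now

(2.3.4) $$E(\theta^*(x_1, x_2, \ldots, x_n)\phi_i(n)) = \sum_{j=1}^{\infty} \int_{R_j} \theta^*(x_1, x_2, \ldots, x_j)\,\frac{\partial^i p_j}{\partial\theta^i} \prod_{u=1}^{j} dx_u.$$

We also know that

(2.3.5) $$\sum_{j=1}^{\infty} \int_{R_j} \theta^*(x_1, x_2, \ldots, x_j)\,p_j \prod_{u=1}^{j} dx_u = \gamma(\theta).$$

Differentiating both sides of (2.3.5) $i$ times $(i = 1, 2, \ldots, m)$ we have, because
of conditions (2.2.3) and (2.2.4):

(2.3.6) $$\sum_{j=1}^{\infty} \int_{R_j} \theta^*(x_1, x_2, \ldots, x_j)\,\frac{\partial^i p_j}{\partial\theta^i} \prod_{u=1}^{j} dx_u = \frac{d^i\gamma(\theta)}{d\theta^i} \quad (i = 1, 2, \ldots, m).$$

From (2.3.4) and (2.3.6), we obtain

(2.3.7) $$E(\theta^*(x_1, x_2, \ldots, x_n)\phi_i(n)) = \frac{d^i\gamma(\theta)}{d\theta^i}.$$

Differentiating

(2.3.8) $$1 = \sum_{j=1}^{\infty} \int_{R_j} p_j \prod_{u=1}^{j} dx_u$$

$i$ times $(i = 1, 2, \ldots, m)$ with respect to $\theta$, we obtain because of conditions
(2.2.5)

(2.3.9) $$0 = \sum_{j=1}^{\infty} \int_{R_j} \frac{\partial^i p_j}{\partial\theta^i} \prod_{u=1}^{j} dx_u \quad (i = 1, \ldots, m).$$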

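(2.3.8) is valid on account of the type of sequential process (§2.1). Now

(2.3.10) $$E(\phi_i(n)) = \sum_{j=1}^{\infty} \int_{R_j} \frac{\partial^i p_j}{\partial\theta^i} \prod_{u=1}^{j} dx_u \quad (i = 1, \ldots, m),$$

which is zero by (2.3.9). By (2.3.7) and (2.3.10), (2.3.2) reduces to

(2.3.11) $$F = \sigma^2(\theta^*(x_1, x_2, \ldots, x_n)) - 2\sum_{i=1}^{m} K_i\,\frac{d^i\gamma(\theta)}{d\theta^i} + \sum_{i,j=1}^{m} K_i K_j \lambda_{ij}.$$

Now $\| \lambda_{ij} \|$ being non-singular on account of condition (2.2.6), we get just
one set of values of $K$'s which minimize $F$. These values are given by

(2.3.12) $$K_j = \sum_{i=1}^{m} {}_m\lambda^{ij}\,\frac{d^i\gamma(\theta)}{d\theta^i},$$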










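where

(2.3.13) $$\| {}_m\lambda^{ij} \| = \| \lambda_{ij} \|^{-1}, \quad (i, j = 1, 2, \ldots, m).$$

Putting the above values of $K_j$ $(j = 1, 2, \ldots, m)$ in (2.3.11), we obtain

(2.3.14) $$F = \sigma^2(\theta^*(x_1, x_2, \ldots, x_n)) - \sum_{i,j=1}^{m} {}_m\lambda^{ij}\,\frac{d^i\gamma(\theta)}{d\theta^i}\,\frac{d^j\gamma(\theta)}{d\theta^j}.$$

Hence, $F$ being non-negative by (2.3.1), we have

(2.3.15) $$\sigma^2(\theta^*(x_1, x_2, \ldots, x_n)) \ge \sum_{i,j=1}^{m} {}_m\lambda^{ij}\,\frac{d^i\gamma(\theta)}{d\theta^i}\,\frac{d^j\gamma(\theta)}{d\theta^j}.$$

Thus the right-hand side of the above inequality gives the lower bound to the variance of
unbiased estimates of $\gamma(\theta)$.¹ When $\gamma(\theta) = \theta$, the above reduces to

(2.3.16) $$\sigma^2(\theta^*(x_1, x_2, \ldots, x_n)) \ge {}_m\lambda^{11}.$$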


When $m = 1$ and $p_n(x_1, x_2, \ldots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)$, (2.3.16) reduces to

(2.3.17) $$\sigma^2(\theta^*(x_1, x_2, \ldots, x_n)) \ge \frac{1}{E n \cdot E\left(\left(\dfrac{\partial}{\partial\theta} \log f(x; \theta)\right)^2\right)},$$

which is the result given by Wolfowitz [5].

When $n$, the chance variable, is constant and equal to $N$, then (2.3.15) and
(2.3.16) correspond to those given by Bhattacharyya [3]. Although the
conditions of regularity under which Bhattacharyya proves his results are not clear
from his paper, they are likely to be slightly different from those in §2.2, as the
results in [3] are obtained only for a fixed size sample.

§2.4. We will now investigate the necessary and sufficient conditions under
which the lower bound given in (2.3.16) is actually higher than that given in
(2.3.17).

We can easily see that

(2.4.1) $${}_m\lambda^{11} = \frac{1}{\lambda_{11}\,(1 - R^2_{1.23\cdots m})},$$

where $R_{1.23\cdots m}$ is the multiple correlation coefficient between $\phi_1(n)$ and $\phi_2(n),
\phi_3(n), \ldots, \phi_m(n)$.

The excess of the lower bound given by (2.3.16) over that when we use $m = 1$
is given by

(2.4.2) $$\frac{1}{\lambda_{11}\,(1 - R^2_{1.23\cdots m})} - \frac{1}{\lambda_{11}},$$

which is further equal to

$$\frac{R^2_{1.23\cdots m}}{\lambda_{11}\,(1 - R^2_{1.23\cdots m})}.$$

Thus the lower bound for the variance of unbiased estimates of $\theta$ obtained
by using $m > 1$ is higher than that obtained by employing $m = 1$ if and only if
$R_{1.23\cdots m}$ is not zero for some $m \ge 2$. This is equivalent to the condition that for
at least one $i \ge 2$, $\lambda_{1i}$, the covariance between $\phi_1(n)$ and $\phi_i(n)$ (and hence
their correlation coefficient), is different from zero. Suppose further that we have
used $m = \alpha$ and that we wish to find the increase in the lower bound if $\alpha$ were
replaced by $\alpha + 1$. The increase in this case is given by

$$\frac{\rho^2_{1(\alpha+1).23\cdots\alpha}}{\lambda_{11}\,(1 - R^2_{1.23\cdots(\alpha+1)})},$$

where $\rho_{1(\alpha+1).23\cdots\alpha}$ is the partial correlation coefficient between $\phi_1(n)$ and $\phi_{\alpha+1}(n)$
keeping $\phi_2(n), \ldots, \phi_\alpha(n)$ fixed. It is greater than zero if and only if $\rho_{1(\alpha+1).23\cdots\alpha}$
is not equal to zero.

¹ Under certain weak restrictions, an optimum lower bound to the variance of unbiased
estimates has been obtained by me along the lines of a similar result for fixed size samples
in an unpublished paper by A. Wald. Independently C. Stein has obtained the same result
in a paper not yet published.
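
The identity (2.4.1) and the expression for the excess in (2.4.2) can be checked on a small numerical case. The sketch below is not from the paper: the matrix `lam` is a hypothetical positive definite $\| \lambda_{ij} \|$ with $m = 3$, and the helper names are illustrative only. Exact rational arithmetic is used so that the two sides can be compared without rounding.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the linear system a x = b exactly by Gauss-Jordan elimination."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def bound_from_inverse(lam):
    """Element (1, 1) of the inverse of lam, i.e. the bound in (2.3.16)."""
    unit = [1] + [0] * (len(lam) - 1)
    return solve(lam, unit)[0]


def squared_multiple_correlation(lam):
    """R^2 of the first variable on the remaining ones."""
    rest = [row[1:] for row in lam[1:]]
    cross = lam[0][1:]
    weights = solve(rest, cross)
    return sum(Fraction(c) * w for c, w in zip(cross, weights)) / Fraction(lam[0][0])


# Hypothetical covariance matrix of phi_1, phi_2, phi_3 (assumed for illustration).
lam = [[4, 2, 1], [2, 3, 1], [1, 1, 2]]
r2 = squared_multiple_correlation(lam)
bound_m3 = bound_from_inverse(lam)       # bound using m = 3
bound_m1 = Fraction(1, lam[0][0])        # bound using m = 1
print(bound_m3, 1 / (lam[0][0] * (1 - r2)), bound_m3 - bound_m1)
```

For this matrix the bound with $m = 3$ exceeds the bound with $m = 1$ precisely because the squared multiple correlation is not zero.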

§2.5. If $p_n(x_1, x_2, \ldots, x_n; \theta_1)$ also depends upon a finite number of other
parameters $\theta_2, \theta_3, \ldots, \theta_T$, then a lower bound higher than or equal to that
given in (2.3.16) can be obtained by using

$$\sum_{i_1+i_2+\cdots+i_T \le m} K_{i_1, i_2, \ldots, i_T} \cdot \phi_{i_1, i_2, \ldots, i_T}(n) \quad \text{instead of} \quad \sum_{i_1=1}^{m} K_{i_1} \cdot \phi_{i_1}(n) \quad \text{in (2.3.1)}.$$

The lower bound in this case is given by (3.1.14) (see section 3) by taking $s = 1$,
that is,

(2.5.1) $$\sigma^2(\theta_1^*(x_1, x_2, \ldots, x_n)) \ge C(1, 1),$$

where $C(1, 1)$ is the element in the first row and first column of the inverse of $W$
defined in (3.1.9).

The result for $n = N$, $N$ fixed, is obtained by Bhattacharyya [3, 1947]. Let
us illustrate it by an example. Take samples of fixed size $N$. Suppose we are
required to find the lower bound to the variance of unbiased estimates of $\theta_1$
in the normal population

(2.5.2) $$f(x; \theta_1, \theta_2) = \frac{1}{\sqrt{2\pi\theta_1}}\,e^{-(x - \theta_2)^2 / 2\theta_1}$$

on the basis of $N$ independent observations $x_1, x_2, \ldots, x_N$. The lower bound
for the variance of the unbiased estimates of $\theta_1$, when we use

$$\sum_{i_1=1}^{m} K_{i_1} \cdot \phi_{i_1}(N) \quad \text{in (2.3.1)},$$

is given by $2\theta_1^2 / N$. However, if $\sum_{i_1+i_2 \le 2} K_{i_1, i_2} \cdot \phi_{i_1, i_2}(N)$ is used, the lower bound, by the help of
(2.5.1), is found to be equal to $2\theta_1^2/(N - 1)$. In fact there exists the statistic

$$\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1},$$

whose variance is equal to $2\theta_1^2/(N - 1)$, where $\bar{x} = \sum_{i=1}^{N} x_i / N$. Thus the use of
$\sum_{i_1+i_2 \le 2} K_{i_1, i_2} \cdot \phi_{i_1, i_2}(N)$ brings into relief the unbiased estimate with the
least variance.
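
The two bounds in this example can be compared with a simulated variance of the statistic $\sum (x_i - \bar{x})^2/(N - 1)$. The following is a rough sketch only, under assumed values $\theta_1 = 2$, $\theta_2 = 1$, $N = 5$ that do not come from the paper; the function name `simulated_variance_of_s2` is hypothetical, and the result is a Monte Carlo approximation rather than an exact figure.

```python
import random
import statistics


def simulated_variance_of_s2(theta1, theta2, n_obs, n_rep, seed=0):
    """Monte Carlo variance of the unbiased sample variance.

    theta1 is the population variance and theta2 the population mean,
    matching the parametrization of (2.5.2).
    """
    rng = random.Random(seed)
    sd = theta1 ** 0.5
    estimates = []
    for _ in range(n_rep):
        sample = [rng.gauss(theta2, sd) for _ in range(n_obs)]
        estimates.append(statistics.variance(sample))  # divisor n_obs - 1
    return statistics.pvariance(estimates)


theta1, theta2, n_obs = 2.0, 1.0, 5          # assumed illustrative values
simulated = simulated_variance_of_s2(theta1, theta2, n_obs, n_rep=20000)
bound_m1 = 2 * theta1 ** 2 / n_obs           # bound using derivatives in theta_1 only
bound_m2 = 2 * theta1 ** 2 / (n_obs - 1)     # bound from (2.5.1)
print(simulated, bound_m1, bound_m2)
```

With these assumed values the simulated variance should lie close to the second bound and clearly above the first, which is the point of the example.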


3. Multi-parameter case. In this section we will prove the result mentioned
in (1.3.2) of §1.3.

§3.1. Let $\theta$ consist of $T$ components $(\theta_1, \theta_2, \ldots, \theta_T)$ and let $\theta_1^*, \theta_2^*, \ldots, \theta_T^*$ be
unbiased estimates of $\theta_1, \theta_2, \ldots, \theta_T$ respectively. Also, let a sequential process
of the type described in §2.1 be given. We postulate the following regularity
conditions:

(3.1.1). The covariance matrix $\| V_{ij} \|$ of the estimates $\theta_i^*$ $(i = 1, 2, \ldots, T)$ is
non-singular in $D$, where $D$ is an open interval of the $T$-dimensional parameter
space.

(3.1.2). The conditions of section (2.2) are satisfied for each one of
$\theta_i^*$ $(i = 1, 2, \ldots, T)$ and $\phi_{i_1, i_2, \ldots, i_T}(n)$, $(i_1 + i_2 + \cdots + i_T \le m)$.

(3.1.3). The covariance matrix of $\phi_{i_1, i_2, \ldots, i_T}(n)$, $i_1 + i_2 + \cdots + i_T \le m$, exists
and is non-singular. Under the assumptions (3.1.1)-(3.1.3), we prove the result
(1.3.2) in section 1.3.

PROOF: Using the same arguments of §2.3, we obtain, for $s = 1, 2, \ldots, T$,

(3.1.4) $$E(\theta_s^*(x_1, x_2, \ldots, x_n) \cdot \phi_{i_1, i_2, \ldots, i_T}(n)) = 1, \quad \text{if } i_s = 1 \text{ and } i_g = 0 \text{ for } g \ne s \ (g = 1, 2, \ldots, T),$$

(3.1.5) $$= 0 \quad \text{otherwise}.$$

Let the covariance matrix of $\theta_j^*$ $(j = 1, 2, \ldots, s;\ s \le T)$ and $\phi_{i_1, i_2, \ldots, i_T}(n)$,
$(i_1 + i_2 + \cdots + i_T \le m)$ be given by

(3.1.6) $$U = \begin{pmatrix} A & B \\ B' & W \end{pmatrix},$$

where

(3.1.7) $$A = \| V_{ij} \|, \quad i, j = 1, 2, \ldots, s;\ s \le T;$$

(3.1.8) $$B = \| I, 0 \|;$$

(3.1.9) and $W$ = covariance matrix of the set

$$[\phi_{i_1, i_2, \ldots, i_T}(n);\ i_1 + i_2 + \cdots + i_T \le m],$$

arranged such that the $j$th term in the leading diagonal is given by

(3.1.10) $$E(\phi^2_{i_1, i_2, \ldots, i_T}(n)), \quad \text{where } i_j = 1,\ i_s = 0,\ s \ne j, \quad (j = 1, 2, \ldots, T),$$

and $B'$ is the transpose of $B$.
As $U$ is positive semi-definite, we have

(3.1.11) $$|U| \ge 0.$$

The above can further be reduced to

(3.1.12) $$|W| \cdot |A - B W^{-1} B'| \ge 0,$$

which leads to

(3.1.13) $$|A - B \cdot W^{-1} \cdot B'| \ge 0, \quad \text{as } W \text{ is positive definite}.$$

By the use of (3.1.8) we obtain from above

(3.1.14) $$|A - C| \ge 0,$$

where $C$ is the top left part of $W^{-1}$, consisting of $s$ rows and $s$ columns.
Let us now consider the matrix

(3.1.15) $$\| V_{ij} - \nu_{ij} \|, \quad (i, j = 1, 2, \ldots, T),$$

where $\| \nu_{ij} \|$ is the top left part of $W^{-1}$ consisting of $T$ rows and $T$ columns, and
is equal to

(3.1.16) $$\| W_{11} - W_{12} W_{22}^{-1} W_{21} \|^{-1},$$

when $W$ is written as

(3.1.17) $$W = \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix},$$

where $W_{11}$ has $T$ rows and $T$ columns.

By the repeated application of (3.1.14), we are led to the conclusion that all
the leading minors of the matrix in (3.1.15) are either positive or zero. Hence
the matrix in (3.1.15) is positive semi-definite.

If now we put

(3.1.18) $$\| \mu_{ij} \| = \| \nu_{ij} \|^{-1},$$

we obtain

(3.1.19) $$\| \mu_{ij} - V^{ij} \|$$

to be positive semi-definite. Thus the ellipsoid

(3.1.20) $$\sum_{i,j=1}^{T} V^{ij} t_i t_j = T + 2$$

contains within itself the ellipsoid

(3.1.21) $$\sum_{i,j=1}^{T} \mu_{ij} t_i t_j = T + 2.$$

Cramér calls the ellipsoid in (3.1.20) a "concentration" ellipsoid.


We will now show that the ellipsoid given by (3.1.21) contains within itself
the ellipsoid

(3.1.22) $$\sum_{i,j=1}^{T} I_{ij} t_i t_j = T + 2,$$

where $\| I_{ij} \|$ is the information matrix given by $W_{11}$ in (3.1.17). We will prove
the above by showing

(3.1.23) $$\| I_{ij} - \mu_{ij} \|, \quad (i, j = 1, 2, \ldots, T),$$

to be positive semi-definite.
We obtain, from (3.1.16) and (3.1.18),

(3.1.24) $$\| \mu_{ij} \| = W_{11} - W_{12} W_{22}^{-1} W_{21}, \quad (i, j = 1, 2, \ldots, T).$$

From the above it follows that

(3.1.25) $$\| I_{ij} - \mu_{ij} \| = W_{12} W_{22}^{-1} W_{21}.$$

The matrix on the right hand side is positive semi-definite, since $W_{22}$ is
positive definite. Hence the ellipsoid (3.1.21) contains within itself the
ellipsoid given by (3.1.22). This proves the assertion made in (1.3.2) of §1.3. It
may be seen that (3.1.22) is strictly contained in (3.1.21) if and only if $W_{12} \ne 0$.
It may be mentioned that in this section as well as elsewhere, $T + 2$, appearing
on the right hand side of the equation of an ellipsoid, can be replaced by any
positive constant. Also the ellipsoid in (3.1.21) depends upon the choice of $m$,
and it can be shown that for any two positive integers $m_1$, $m_2$ $(m_1 > m_2)$ the
ellipsoid for $m = m_1$ contains within itself the one for $m = m_2$.
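
The block relations (3.1.16), (3.1.24) and (3.1.25) can be verified on a small numerical case. The sketch below is illustrative only: the matrix `W` is a hypothetical positive definite covariance matrix with $T = 2$, not one taken from the paper, and the helper functions are written out here so that the example is self-contained.

```python
from fractions import Fraction


def inverse(m):
    """Exact inverse of a square matrix by Gauss-Jordan elimination."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def subtract(a, b):
    """Elementwise difference of two matrices of equal shape."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical W, partitioned as in (3.1.17) with T = 2 (assumed for illustration).
W = [[4, 2, 1], [2, 3, 1], [1, 1, 2]]
T = 2
W11 = [row[:T] for row in W[:T]]
W12 = [row[T:] for row in W[:T]]
W21 = [row[:T] for row in W[T:]]
W22 = [row[T:] for row in W[T:]]

correction = matmul(matmul(W12, inverse(W22)), W21)  # right-hand side of (3.1.25)
mu = subtract(W11, correction)                        # (3.1.24)
nu = [row[:T] for row in inverse(W)[:T]]              # top left part of W^{-1}
print(mu, inverse(nu) == mu)
```

Here `correction` is a rank-one matrix with non-negative diagonal and zero determinant, so it is positive semi-definite, which is the property used to conclude that (3.1.21) contains (3.1.22).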

§3.2. In general, let $\theta_i^*(x_1, x_2, \ldots, x_n)$ be statistics whose expectations are
$\gamma_i(\theta_1, \theta_2, \ldots, \theta_T)$, $(i = 1, 2, \ldots, T)$, the latter being assumed to admit partial
derivatives of all possible orders. Under the postulates enumerated in §3.1,
we see that the ellipsoid in (3.1.20) contains within itself the ellipsoid

(3.2.1) $$\sum_{i,j=1}^{T} S_{ij} t_i t_j = T + 2,$$

where

(3.2.2) $$\| S_{ij} \| = \| R W^{-1} R' \|^{-1}, \quad i, j = 1, 2, \ldots, T,$$

and

(3.2.3) $$R = \left\| \frac{\partial^{\,i_1+i_2+\cdots+i_T}}{\partial\theta_1^{i_1}\,\partial\theta_2^{i_2}\cdots\partial\theta_T^{i_T}}\,\gamma_j(\theta_1, \theta_2, \ldots, \theta_T) \right\|, \quad (j = 1, 2, \ldots, T;\ i_1 + i_2 + \cdots + i_T \le m),$$

where $j$ and $i_1 + i_2 + \cdots + i_T$ indicate the number of the row and the column
respectively, and $R$ is arranged to correspond to the arrangement of $W$, where $W$
is the same as given in (3.1.9).


4. Achievement of the different lower bounds. In §4.1 we will demonstrate
the desirability of finding a higher lower bound to the variance of sequential
estimates than that given by Wolfowitz, by giving two examples in which the
latter is not achieved. From §2.4 it is clear that this will be so if $E(\phi_1(n) \cdot \phi_i(n))$
is not zero for at least one value of $i \ge 2$. We will demonstrate that this is true
for $i = 2$. In §4.2 we show that if an "efficient" statistic exists for all $M \ge N$,
the bound is achieved only in the case when the sample size is fixed. In §4.3
we obtain necessary and sufficient conditions for the attainment of the bound
given in (1.1.4). In §4.4 we discuss the conditions under which there exists a
"concentration ellipsoid" which coincides with the ellipsoid given in (3.1.21) for
samples of fixed size $N$.

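§4.1. Ex. 1. The Wald sequential procedure for testing $\theta = \theta_1$ against $\theta =
\theta_2$ in a normal population

(4.1.1) $$\frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(x - \theta)^2}$$

is given as follows: If

(4.1.2) $$B < \sum_{i=1}^{s} \left(x_i - \frac{\theta_1 + \theta_2}{2}\right) < A, \quad s = 1, 2, \ldots, j - 1,$$

and

(4.1.3) $$\sum_{i=1}^{j} \left(x_i - \frac{\theta_1 + \theta_2}{2}\right) \ \text{is either } \ge A \text{ or } \le B,$$

we cease sampling and make a decision. Here $A$ and $B$ are constants fixed by the
probability levels of making a correct decision.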

Let us denote the set of points satisfying (4.1.2) and (4.1.3) by $R_j$. In this
case

(4.1.4) $$\phi_1(n) = \sum_{i=1}^{n} (x_i - \theta) = Z_n - n\theta, \quad \text{where } Z_n = \sum_{i=1}^{n} x_i.$$

The above is differentiable with respect to $\theta$. On differentiating we have

(4.1.5) $$\phi_2(n) = (Z_n - n\theta)^2 - n.$$

Now

(4.1.6) $$E(\phi_1(n) \cdot \phi_2(n)) = E(Z_n - n\theta)^3 - E(n(Z_n - n\theta)).$$

By theorem 7.3, Wolfowitz [5],

(4.1.7) $$E(Z_n - n\theta)^3 = E n \cdot E(X - \theta)^3 + 3E(n(Z_n - n\theta)),$$

where $X$ has the distribution given in (4.1.1). As $E(X - \theta)^3$ is equal to zero,
(4.1.6) reduces to

(4.1.8) $$E(\phi_1(n) \cdot \phi_2(n)) = 2E(n(Z_n - n\theta)).$$

We will now show that the right hand side of (4.1.8) is not identically zero in $\theta$.
Let us consider

(4.1.9) $$E(n) = \sum_{j=1}^{\infty} \int_{R_j} \frac{j}{(2\pi)^{j/2}} \exp\left(-\frac{1}{2}\sum_{i=1}^{j} (x_i - \theta)^2\right) \prod_{u=1}^{j} dx_u.$$

Differentiating with respect to $\theta$, we get

(4.1.10) $$\frac{d}{d\theta}(E n) = \sum_{j=1}^{\infty} \int_{R_j} \frac{j \sum_{i=1}^{j} (x_i - \theta)}{(2\pi)^{j/2}} \exp\left(-\frac{1}{2}\sum_{i=1}^{j} (x_i - \theta)^2\right) \prod_{u=1}^{j} dx_u.$$

The right hand side of the above equation being equal to $E(n(Z_n - n\theta))$, the
latter does not vanish identically in $\theta$, because the left hand side is not identically
zero. The step from (4.1.9) to (4.1.10) can be easily seen to be valid.
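
That $E(n)$ varies with $\theta$ in Ex. 1 can be seen in a small simulation. The sketch below is a rough illustration under assumptions that are not in the paper: the boundaries $A = 3$, $B = -3$, the hypotheses $\theta_1 = 0$, $\theta_2 = 1$, and the function names are all hypothetical, and the stopping statistic is the cumulative sum of $x_i - (\theta_1 + \theta_2)/2$ for unit-variance normal observations.

```python
import random


def wald_sample_size(theta, theta1, theta2, upper, lower, rng):
    """Number of observations drawn before the cumulative sum of
    x_i - (theta1 + theta2) / 2 first leaves the open interval (lower, upper).

    Observations are normal with mean theta and unit variance.
    Assumes lower < 0 < upper so that at least one observation is taken.
    """
    midpoint = (theta1 + theta2) / 2
    total, n = 0.0, 0
    while lower < total < upper:
        total += rng.gauss(theta, 1.0) - midpoint
        n += 1
    return n


def mean_sample_size(theta, n_rep=2000, seed=0):
    """Monte Carlo estimate of E(n) at the true parameter value theta."""
    rng = random.Random(seed)
    sizes = [wald_sample_size(theta, 0.0, 1.0, 3.0, -3.0, rng) for _ in range(n_rep)]
    return sum(sizes) / n_rep


at_midpoint = mean_sample_size(0.5)   # true mean halfway between the hypotheses
far_away = mean_sample_size(2.0)      # true mean well beyond theta_2
print(at_midpoint, far_away)
```

The estimated average sample size is considerably larger when the true mean lies midway between the two hypotheses than when it lies far from both, which is consistent with the derivative of $E(n)$ with respect to $\theta$ not vanishing identically.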

Ex. 2. The Wald sequential procedure for testing $p = p_1$ against $p = p_2$ in a
binomial distribution, where $p$ is the probability of the event occurring, is given
as follows: If

(4.1.11) $$B < \sum_{i=1}^{s} (x_i - d) < A, \quad s = 1, 2, \ldots, j - 1,$$

and

(4.1.12) $$\sum_{i=1}^{j} (x_i - d) \ \text{is either } \ge A \text{ or } \le B,$$

where $d$ is given by $[\log\,(1 - p_1)/(1 - p_2)]/\log\,[p_2(1 - p_1)/p_1(1 - p_2)]$, the
process stops with the $j$th observation and a decision is taken. Here, $x_i$ is the
characteristic function of the event at the $i$th trial, that is:

$x_i = 1$, when the event occurs at the $i$th trial;

$x_i = 0$, otherwise.

Let us denote the set of points satisfying (4.1.11) and (4.1.12) by $R_j$. In this
case we find

(4.1.13) $$E[\phi_1(n) \cdot \phi_2(n)] = \frac{2}{p^2(1 - p)^2}\,E[n(Z_n - np)],$$

where $Z_n = \sum_{i=1}^{n} x_i$. We have now to show that the right hand side is not
identically zero. Differentiating

(4.1.14) $$E(n) = \sum_{j=1}^{\infty} \sum_{R_j} j\,p^{Z_j}(1 - p)^{j - Z_j}$$

with regard to $p$, we obtain

(4.1.15) $$\frac{d}{dp}(E n) = \sum_{j=1}^{\infty} \sum_{R_j} \frac{j(Z_j - jp)}{p(1 - p)} \cdot p^{Z_j}(1 - p)^{j - Z_j}.$$

The right hand side of the above is the same as

(4.1.16) $$\frac{1}{p(1 - p)}\,E(n(Z_n - np)).$$

Thus, the left hand side of (4.1.15) being not identically zero, the same is true for
(4.1.16), and consequently the bound given by Wolfowitz is not achieved in this
case.




The step from (4.1.14) to (4.1.15) is valid as

(4.1.17) $\sum_{j=1}^{\infty} \sum_{R_j} j\, \frac{Z_j - jp}{p(1 - p)}\, p^{Z_j} (1 - p)^{j - Z_j}$

is absolutely and uniformly convergent.
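The relation between the derivative of $E(n)$ and $E(n(Z_n - np))$ used in Ex. 2 can be checked numerically. The following is only a sketch and not part of the original argument: it assumes a truncated version of the Wald rule (sampling is forced to stop at `max_n`, which is still a valid stopping rule whose regions do not depend on $p$), and the function name `sprt_moments` as well as the values chosen for `p1`, `p2`, `A`, `B` are illustrative assumptions.

```python
import math


def sprt_moments(p, p1=0.3, p2=0.6, A=2.0, B=-2.0, max_n=200):
    """Return (E(n), E(n(Z_n - n p))) for a truncated binomial Wald rule.

    Sampling continues while B < sum(x_i - d) < A and fewer than max_n
    observations have been taken. The truncation at max_n is an assumption
    made here so that the computation is finite.
    """
    d = math.log((1 - p1) / (1 - p2)) / math.log(p2 * (1 - p1) / (p1 * (1 - p2)))
    # alive[k] = probability of having k successes so far without stopping
    alive = {0: 1.0}
    e_n = 0.0
    e_nz = 0.0
    for s in range(1, max_n + 1):
        nxt = {}
        for k, pr in alive.items():
            for x, q in ((1, p), (0, 1.0 - p)):
                k2 = k + x
                pr2 = pr * q
                stat = k2 - s * d  # sum of (x_i - d) after s trials
                if B < stat < A and s < max_n:
                    nxt[k2] = nxt.get(k2, 0.0) + pr2
                else:  # the process stops with the s-th observation
                    e_n += s * pr2
                    e_nz += s * (k2 - s * p) * pr2
        alive = nxt
        if not alive:
            break
    return e_n, e_nz


def derivative_of_expected_n(p, h=1e-5, **kwargs):
    """Central-difference approximation to d E(n) / dp."""
    return (sprt_moments(p + h, **kwargs)[0] - sprt_moments(p - h, **kwargs)[0]) / (2 * h)
```

Comparing `derivative_of_expected_n(p)` with `sprt_moments(p)[1] / (p * (1 - p))` for a value such as `p = 0.3` illustrates that the two sides of (4.1.15) and (4.1.16) agree and that neither vanishes there, under the assumptions stated above.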

§4.2. Let $\theta^*$ be some unbiased estimate of $\theta$, where the $x_i$'s are successive independent observations on the chance variable $X$ having the probability density function or probability function $f(x; \theta)$. We adopt a sequential procedure mentioned in §2.1 satisfying the regularity conditions in §2.2 and also postulate the following:

(i) For all positive integral values of $M \geq N$,

$p_M(x_1, x_2, \cdots, x_M; \theta) = \prod_{i=1}^{M} f(x_i; \theta)$

possesses an 'efficient' estimate for $\theta$, where $N$ is the least value of $n$ for which $\Pr(n = N) \neq 0$.

(ii) $E(n)$ exists and admits derivatives up to the second order with respect to $\theta$. Furthermore, $\frac{d}{d\theta} E(n)$ is either zero for all $\theta$ under consideration or is never zero.

Under the above conditions the Wolfowitz lower bound for the variance of unbiased estimates is achieved only when $\Pr(n = N) = 1$.


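Proof: This bound will be attained if and only if there exists an unbiased estimate $\theta^*$ of $\theta$ such that

(4.2.1) $E(\theta^* - \theta - K\phi_1(n))^2 = 0$,

that is,

(4.2.2) $\theta^* - \theta = K\phi_1(n)$

with probability one, where $K$ is independent of all $x_i$'s and $n$. As there exists an 'efficient' estimate, say $\psi(M)$, for all $M \geq N$, we have

(4.2.3) $\psi(M) - \theta = \frac{\phi_1(M)}{M \cdot E\left[\left(\frac{\partial}{\partial\theta} \log f(x; \theta)\right)^2\right]}$

for all $M \geq N$. From (4.2.2) and (4.2.3), it follows that

(4.2.4) $\theta^* - \theta = K \cdot n \cdot (\psi(n) - \theta) \cdot E\left[\left(\frac{\partial}{\partial\theta} \log f(x; \theta)\right)^2\right]$.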
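Now as

(4.2.5) $K = \frac{1}{En \cdot E\left[\left(\frac{\partial}{\partial\theta} \log f(x; \theta)\right)^2\right]}$,

we have

(4.2.6) $\theta^* - \theta = \frac{n(\psi(n) - \theta)}{En}$.

If $E(n)$ is independent of $\theta$, then from (4.2.6), we obtain

(4.2.7) $n/E(n) = 1$,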


that is, $n$ is constant with probability one and the sequential procedure reduces to a fixed size sample case. If $E(n)$ is not independent of $\theta$, then differentiating (4.2.6) with regard to $\theta$, we obtain

(4.2.8) $-1 = -\frac{n(\psi(n) - \theta)\, \frac{d}{d\theta}(En)}{(En)^2} - \frac{n}{En}$.

As $\frac{d}{d\theta}(En)$ is not equal to zero for any $\theta$ under consideration, substituting the value of $\psi(n)$ from (4.2.8) in (4.2.6), the latter takes the form:

(4.2.9) $\theta^* - \theta = \frac{En - n}{\frac{d}{d\theta}(En)}$.

Differentiating the above with respect to $\theta$, the result is:

(4.2.10) $-1 = -\frac{En - n}{\left(\frac{d}{d\theta} En\right)^2} \cdot \frac{d^2}{d\theta^2}(En) + 1$.

Now if $\frac{d^2}{d\theta^2}(En) = 0$, then (4.2.10) is not valid, thereby contradicting (4.2.2). If $\frac{d^2}{d\theta^2}(En) \neq 0$, then rearranging (4.2.10), we obtain

(4.2.11) $n = -\frac{2\left(\frac{d}{d\theta} En\right)^2}{\frac{d^2}{d\theta^2}(En)} + En$,

that is, $n$ is a constant with probability one. This proves that the Wolfowitz bound is achieved only in the case when $n = N$ with probability one. This generalizes the result of Blackwell and Girshick [6] to the extent that in [6] the existence of an efficient estimate is assumed for all integral values of $M$ instead of $M \geq N$, as assumed here.² Moreover the proof given here, with slight modifications, is also valid when the successive observations are not independent.


² In [6] the assumption that "$x_1 + x_2 + \cdots + x_M$ be a sufficient statistic for all $M$" really amounts to the postulate that "$x_1 + x_2 + \cdots + x_M$ be an 'efficient' statistic for all $M$," when we restrict ourselves to probability density functions satisfying the conditions given by Koopman in [7].




§4.3. Let us consider a sample of fixed size $N$. Let $\theta^*$ together with the probability density function $p_N$ satisfy the following regularity conditions:

(i). There exists a transformation $T$ from $(x_1, x_2, \cdots, x_N)$ to the variables

(4.3.1) $\xi_i = \xi_i(x_1, x_2, \cdots, x_N), \quad \theta^* = \theta^*(x_1, x_2, \cdots, x_N), \quad i = 1, 2, \cdots, N - 1$,

such that

(a). The functions $\xi_i$ are everywhere unique and continuous, and have continuous partial derivatives $\frac{\partial \xi_i}{\partial x_u}$, $\frac{\partial \theta^*}{\partial x_u}$ $(i = 1, 2, \cdots, N - 1;\ u = 1, 2, \cdots, N)$ in all points $(x_1, x_2, \cdots, x_N)$ except possibly in certain points belonging to a finite number of hyper-surfaces.

(b). The relations (4.3.1) define a one-to-one correspondence between the points $x = (x_1, x_2, \cdots, x_N)$ and $y = (\xi_1, \xi_2, \cdots, \xi_{N-1}, \theta^*)$ so that conversely $x_i = \eta_i(\xi_1, \xi_2, \cdots, \xi_{N-1}, \theta^*)$ where the $\eta_i$ are unique.

(ii). There exist partial derivatives of $g(\theta^*; \theta)$, $h(\xi_1, \xi_2, \cdots, \xi_{N-1} \mid \theta^*; \theta)$ with regard to $\theta$ of all orders up to and including $m$, where $m$ is some finite integer. The variances of $\theta^*$, $h_i$ and $g_i \cdot g_j$, $i, j = 1, 2, \cdots, m$, are finite, where $h_i$ and $g_i$ are defined in section 1.

(iii). There exist functions $T_{ij}$ $(i = 1, 2, \cdots, m;\ j = 1, 2, 3)$ such that

$\left|\frac{\partial^i p_N}{\partial\theta^i}\right| < T_{i1}(x_1, x_2, \cdots, x_N)$;

$\left|\frac{\partial^i g}{\partial\theta^i}\right| < T_{i2}(\theta^*)$;

$\left|\frac{\partial^i h}{\partial\theta^i}\right| < T_{i3}(\xi_1, \xi_2, \cdots, \xi_{N-1}; \theta^*)$,

for all $\theta$ in $D$ and for almost all $(x_1, x_2, \cdots, x_N)$, where $D$ is an open interval. Further

$\int T_{i1}(x_1, x_2, \cdots, x_N) \prod_{u=1}^{N} dx_u$, $\quad \int T_{i2}(\theta^*)\, d\theta^*$ and $\int T_{i3}(\xi_1, \xi_2, \cdots, \xi_{N-1}; \theta^*) \prod_{i=1}^{N-1} d\xi_i$

are all finite, the range of integration, in each case, being the whole range for the arguments indicated. Then the necessary and sufficient conditions that the variance of $\theta^*$ equals the lower bound given in (1.1.4) are










(iv). $h_1, h_2, \cdots, h_m$ are linearly dependent considered as functions of $\xi_1, \xi_2, \cdots, \xi_{N-1}$ for any given $\theta^*$ and $\theta$, and

(v). The probability density function $g$ of $\theta^*$ is of the form

$\theta^* - \theta = \sum_{i=1}^{m} K_i g_i$

where $K_i$ may depend upon $\theta$ and $N$ only.

The proof here is given when $p_N$ is a probability density function. It is also valid with slight modification when $p_N$ is the probability of discrete variables.

Proof: Let $J$ be the Jacobian of the transformation $T$ in (4.3.1). Then because of conditions (i) and (ii) above, we have,


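(4.3.2) $p_N(x_1, x_2, \cdots, x_N; \theta) \cdot |J| = g(\theta^*; \theta) \cdot h(\xi_1, \xi_2, \cdots, \xi_{N-1} \mid \theta^*; \theta)$.

Further

(4.3.3) $\int h(\xi_1, \xi_2, \cdots, \xi_{N-1} \mid \theta^*; \theta) \prod_{i=1}^{N-1} d\xi_i = 1$,

the range of integration being the space of $\xi_1, \xi_2, \cdots, \xi_{N-1}$. Differentiating the above $i$ times under the integral sign, it follows that

(4.3.4) $E(h_i \mid \theta^*; \theta) = 0$.

Similarly we have

(4.3.5) $E(g_i \cdot h_i) = 0$

as the expectation of the quantity on the L.H.S. is finite by virtue of (ii). More generally, we have

(4.3.6) $E(F(\theta^*) \cdot h_i) = E[F(\theta^*) \cdot E(h_i \mid \theta^*)] = 0$

if $E(F(\theta^*) \cdot h_i)$ is finite. Let us now examine

(4.3.7) $E\left(\theta^* - \theta - \sum_{i=1}^{m} K_i \phi_i(N)\right)^2$

where $K_i \phi_i(N)$ can also be written as

(4.3.8) $K_i \phi_i(N) = K_i \sum_{j=0}^{i} \binom{i}{j} h_j g_{i-j}$, with $h_0 = g_0 = 1$.

Now (4.3.7) can be put in the form

(4.3.9) $E\left(\theta^* - \theta - \sum_{i=1}^{m} K_i g_i - \sum_{i=1}^{m} L_i h_i\right)^2$,

where

(4.3.10) $L_i = \sum_{j=i}^{m} K_j \binom{j}{i} g_{j-i}, \quad (i = 1, 2, \cdots, m)$,

clearly depend on $\theta$ and $\theta^*$ only.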







By virtue of (4.3.4)-(4.3.6), and $F(\theta^*)$ involved in (4.3.9) being such that $E|F(\theta^*) \cdot h_i|$ $(i = 1, 2, \cdots, m)$ is finite because of (ii), we can further reduce (4.3.9) to

(4.3.11) $E\left(\theta^* - \theta - \sum_{i=1}^{m} K_i g_i\right)^2 + E\left[E\left(\left(\sum_{i=1}^{m} L_i h_i\right)^2 \,\Big|\, \theta^*\right)\right]$.

The lower bound will be achieved if and only if the above expression is zero, the necessary and sufficient conditions for which are:

(4.3.12) $\theta^* - \theta = \sum_{i=1}^{m} K_i \cdot g_i$,

and

(4.3.13) $\sum_{i=1}^{m} L_i h_i = 0$ in $\xi_1, \xi_2, \cdots, \xi_{N-1}$

for any given values of $\theta^*$ and $\theta$.

(4.3.13) is equivalent to the condition that $h_i$ $(i = 1, 2, \cdots, m)$ are linearly dependent considered as functions of $\xi_1, \xi_2, \cdots, \xi_{N-1}$ for any given values of $\theta$ and $\theta^*$.

When $m$ takes the value one, the above reduces to the Cramér conditions for the existence of an "efficient" estimate.


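§4.4. Multiparameter case. Let $\theta_1^*, \theta_2^*, \cdots, \theta_T^*$ be the unbiased estimates of $\theta_1, \theta_2, \cdots, \theta_T$ in the probability density function

$p_N(x_1, x_2, \cdots, x_N; \theta_1, \theta_2, \cdots, \theta_T)$

and let the regularity conditions of §4.3 be satisfied when $\theta^*$ and $\frac{\partial^i}{\partial\theta^i}$ $(i = 1, 2, \cdots, m)$ are replaced by $\theta_j^*$ $(j = 1, 2, \cdots, T)$ and

$\frac{\partial^{i_1 + i_2 + \cdots + i_T}}{\partial\theta_1^{i_1}\, \partial\theta_2^{i_2} \cdots \partial\theta_T^{i_T}}, \quad (i_1 + i_2 + \cdots + i_T \leq m)$

respectively. Further let

(4.4.1) $p_N(x_1, x_2, \cdots, x_N; \theta_1, \theta_2, \cdots, \theta_T) \cdot |J| = g(\theta_1^*, \theta_2^*, \cdots, \theta_T^*; \theta_1, \theta_2, \cdots, \theta_T) \cdot h(\xi_1, \xi_2, \cdots, \xi_{N-T} \mid \theta_1^*, \theta_2^*, \cdots, \theta_T^*)$

where $g$ and $h$ are respectively the joint probability distribution function of $\theta_1^*, \theta_2^*, \cdots, \theta_T^*$ and the conditional probability distribution of $\xi_1, \xi_2, \cdots, \xi_{N-T}$ for a given set of values of $\theta_1^*, \theta_2^*, \cdots, \theta_T^*$. In order that the ellipsoid (3.1.20) coincides with the one given by (3.1.21), it is necessary and sufficient that the following be satisfied for each $t$ $(t = 1, 2, \cdots, T)$:

(4.4.2) $E\left(\theta_t^* - \theta_t - \sum_{i_1 + i_2 + \cdots + i_T \leq m} K^{(t)}_{i_1 i_2 \cdots i_T}\, \phi_{i_1 i_2 \cdots i_T}(N)\right)^2 = 0$.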










Now reasoning similar to that in §4.3, we conclude from the above that the necessary and sufficient conditions are:

(4.4.3) There exist $T$ independent linear combinations of $h_{i_1 i_2 \cdots i_T}$ $(i_1 + i_2 + \cdots + i_T \leq m)$ which vanish with probability one for any given values of the sets $(\theta_1^*, \theta_2^*, \cdots, \theta_T^*)$ and $(\theta_1, \theta_2, \cdots, \theta_T)$,

and

(4.4.4) $\theta_t^* - \theta_t = \sum_{i_1 + i_2 + \cdots + i_T \leq m} K^{(t)}_{i_1 i_2 \cdots i_T}\, g_{i_1 i_2 \cdots i_T}, \quad t = 1, 2, \cdots, T$,

where the $K$'s do not depend upon $\theta_i^*$ and $\xi_s$. For $T = 1$, the above reduce to the conditions in §4.3. We will now give an example in which (4.4.3) and (4.4.4) are satisfied. Let

(4.4.5) $p_N(x_1, x_2, \cdots, x_N; \theta_1, \theta_2) = \frac{1}{(2\pi\theta_2)^{N/2}} \exp\left[-\frac{1}{2\theta_2} \sum_{i=1}^{N} (x_i - \theta_1)^2\right]$.

We have

(4.4.6) $\theta_2^* = \sum_{i=1}^{N} (x_i - \bar{x})^2 / (N - 1)$,

(4.4.7) $\theta_1^* = \sum_{i=1}^{N} x_i / N = \bar{x}$,

unbiased estimates of $\theta_1$ and $\theta_2$ in (4.4.5). The joint distribution of $\theta_1^*$ and $\theta_2^*$ is given by

(4.4.8) $g(\theta_1^*, \theta_2^*; \theta_1, \theta_2) = C \cdot \theta_2^{-N/2} \cdot \theta_2^{*\,(N-3)/2} \cdot \exp\left[-\frac{N(\theta_1^* - \theta_1)^2 + (N - 1)\theta_2^*}{2\theta_2}\right]$.

It can be easily seen that the condition (4.4.3) is satisfied, and the estimates themselves can be put in the form

(4.4.9) $\theta_2^* = \theta_2 + \frac{2\theta_2^2}{N - 1} \cdot \frac{1}{g} \frac{\partial g}{\partial\theta_2} - \frac{\theta_2^2}{N(N - 1)} \cdot \frac{1}{g} \frac{\partial^2 g}{\partial\theta_1^2}$,

(4.4.10) $\theta_1^* = \theta_1 + \frac{\theta_2}{N} \cdot \frac{1}{g} \frac{\partial g}{\partial\theta_1}$.

It is thus seen that the 'concentration' ellipsoid for $\theta_1^*$, $\theta_2^*$ coincides with the ellipsoid (3.1.21) for $m = 2$. On the other hand if we use $m = 1$, the condition (4.4.3) is satisfied but not the one in (4.4.4), as can be seen from (4.4.9), and thus the concentration ellipsoid strictly contains within itself the one given by the information matrix. It may be noted that for $m = 1$, the condition (4.4.3) merely requires that a system of sufficient statistics exists for estimating $\theta_1, \theta_2, \cdots, \theta_T$. The reason is that the condition (4.4.3) takes the equivalent form

(4.4.11) $\frac{\partial h}{\partial\theta_i} = 0$

for $i = 1, 2, \cdots, T$ identically in $\xi_1, \xi_2, \cdots, \xi_{N-T}$, that is, that $h$ is free of $\theta_1, \theta_2, \cdots, \theta_T$.
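The representation of the two estimates in the normal example of §4.4 can be verified numerically. The sketch below is an illustration only: it takes the joint density of the sample mean and the unbiased sample variance up to its constant factor $C$, approximates the derivatives of $g$ by central differences, and evaluates the right-hand sides $\theta_1 + \frac{\theta_2}{N}\frac{1}{g}\frac{\partial g}{\partial\theta_1}$ and $\theta_2 + \frac{2\theta_2^2}{N-1}\frac{1}{g}\frac{\partial g}{\partial\theta_2} - \frac{\theta_2^2}{N(N-1)}\frac{1}{g}\frac{\partial^2 g}{\partial\theta_1^2}$. The function names and the step size `h` are assumptions of this sketch.

```python
import math


def g(t1, t2, th1, th2, N):
    """Joint density of (mean, unbiased variance) up to the constant C."""
    return (th2 ** (-N / 2) * t2 ** ((N - 3) / 2)
            * math.exp(-(N * (t1 - th1) ** 2 + (N - 1) * t2) / (2 * th2)))


def reconstructed_estimates(t1, t2, th1, th2, N, h=1e-4):
    """Evaluate the right-hand sides of the two representations numerically.

    t1, t2 play the role of the observed estimates; th1, th2 are the
    parameters. Derivatives of g are taken by central differences.
    """
    g0 = g(t1, t2, th1, th2, N)
    dg_th1 = (g(t1, t2, th1 + h, th2, N) - g(t1, t2, th1 - h, th2, N)) / (2 * h)
    d2g_th1 = (g(t1, t2, th1 + h, th2, N) - 2 * g0
               + g(t1, t2, th1 - h, th2, N)) / h ** 2
    dg_th2 = (g(t1, t2, th1, th2 + h, N) - g(t1, t2, th1, th2 - h, N)) / (2 * h)
    est1 = th1 + (th2 / N) * dg_th1 / g0
    est2 = (th2 + 2 * th2 ** 2 / (N - 1) * dg_th2 / g0
            - th2 ** 2 / (N * (N - 1)) * d2g_th1 / g0)
    return est1, est2
```

For any admissible arguments, `reconstructed_estimates(t1, t2, th1, th2, N)` should return values close to `(t1, t2)`; the second-order term in the expression for the variance estimate is what makes $m = 2$ necessary.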


5. Miscellaneous. In §5.1-§5.3 we discuss certain properties of $\phi_i(n)$. In §5.4 we obtain conditions under which there exists no unbiased estimate of $\theta$, having a finite variance, which is functionally dependent upon a given unbiased estimate $\theta^*$ of $\theta$.

§5.1. Assume that there exists an "efficient" statistic $\theta^*(x_1, x_2, \cdots, x_N)$ for estimating $\theta$, in probability density function (or probability)

$p_N(x_1, x_2, \cdots, x_N; \theta)$.

That is,

(5.1.1) $\theta^*(x_1, x_2, \cdots, x_N) - \theta = K \cdot \phi_1(N)$

where $K$ as usual may only depend on $\theta$. We postulate as usual the existence of all partial derivatives of $p_N$ of all orders and also of $K$ up to the third order with

(5.1.2) $\frac{d^3 K}{d\theta^3} = 0$.

Further we assume that

$\left|\frac{\partial^i p_N}{\partial\theta^i}\right| < T_i(x_1, x_2, \cdots, x_N)$

where

$\int T_i(x_1, x_2, \cdots, x_N) \prod_{u=1}^{N} dx_u$ is finite for all $i$.

Under the above assumptions we will show that

$\phi_0(N) = 1,\ \phi_1(N),\ \phi_2(N),\ \cdots,\ \phi_i(N),\ \cdots$

form a set of orthogonal polynomials in $\phi_1(N)$ with respect to the weight function $p_N(x_1, x_2, \cdots, x_N; \theta)$.

Proof: We can easily see that

(5.1.3) $\frac{\partial\phi_i}{\partial\theta} = \phi_{i+1} - \phi_1 \phi_i$

where $\phi_i(N)$ is shortened to $\phi_i$ for convenience. Differentiating (5.1.1) with respect to $\theta$,

(5.1.4) $\frac{\partial\phi_1}{\partial\theta} = -\frac{1}{K} \frac{dK}{d\theta} \phi_1 - \frac{1}{K}$.







Let us designate

(5.1.5) $z_i = \frac{1}{K} \frac{d^i K}{d\theta^i}$

for all integral values of $i$. From (5.1.3) and (5.1.4), it follows that

(5.1.6) $\phi_2 - \phi_1^2 = -z_1 \phi_1 - \frac{1}{K}$.

Differentiating (5.1.6) further with regard to $\theta$ and using (5.1.3) and (5.1.6) we obtain

(5.1.7) $\phi_3 - \phi_1 \phi_2 = -2 z_1 \phi_2 - \left(z_2 + \frac{2}{K}\right) \phi_1$.

Differentiating (5.1.7) with regard to $\theta$, and using (5.1.2) we get

(5.1.8) $\phi_4 - \phi_1 \phi_3 = -3 z_1 \phi_3 - \left(3 z_2 + \frac{3}{K}\right) \phi_2$.

We assume generally that

(5.1.9) $\phi_{i+1} - \phi_1 \phi_i = -i z_1 \phi_i - \left(\frac{i(i - 1)}{2} z_2 + \frac{i}{K}\right) \phi_{i-1}$.

Differentiating (5.1.9), and employing (5.1.2), (5.1.3) and (5.1.9) we obtain

(5.1.10) $\phi_{i+2} - \phi_1 \phi_{i+1} = -(i + 1) z_1 \phi_{i+1} - \left(\frac{(i + 1) i}{2} z_2 + \frac{i + 1}{K}\right) \phi_i$.

We know that (5.1.9) holds for $i = 1, 2, 3$; $\phi_0$ being taken equal to one, and we have proved that if (5.1.9) is true for $i = j$, it is true for $i = j + 1$. Thus by mathematical induction (5.1.9) holds good for all integral values of $i$.

It is also clear from (5.1.6) and (5.1.9) that $\phi_i$ can be expressed as a polynomial in $\phi_1$ of the $i$th degree, the coefficient of $\phi_1^i$ being equal to unity.

To complete the proof of our assertion we will now prove that

(5.1.11) $E(\phi_i \cdot \phi_j) = 0, \quad i \neq j$.

From (5.1.9)

(5.1.12) $\phi_1 \phi_i = \phi_{i+1} + i z_1 \phi_i + \left(\frac{i(i - 1)}{2} z_2 + \frac{i}{K}\right) \phi_{i-1}$,

where $i$ is any positive integer. We multiply both sides of (5.1.12) by $\phi_1$ and reduce every product $\phi_1 \phi_j$ to a linear combination of $\phi_{j+1}$, $\phi_j$ and $\phi_{j-1}$ with the help of (5.1.12). Repeating this process $j - 1$ times $(j < i)$ it follows that:

(5.1.13) $\phi_1^j \phi_i = \phi_{i+j} + \sum_{u=1}^{2j-1} d_u \cdot \phi_{i+j-u} + d_{2j} \cdot \phi_{i-j}$,

where the $d_u$ are functions of $K$, $z_1$ and $z_2$. From (5.1.13), by taking expectations of both sides,

(5.1.14) $E(\phi_1^j \cdot \phi_i) = 0, \quad (j < i)$.







Now, since $\phi_j$ is a polynomial of the $j$th degree in $\phi_1$ we conclude that (5.1.11) is true for all integral (positive) values of $i$.
Thus we obtain

(5.1.15) $\phi_0(N) = 1,\ \phi_1(N),\ \phi_2(N),\ \cdots,\ \phi_i(N),\ \cdots$,

as a set of orthogonal polynomials in $\phi_1(N)$, the weight function being

$p_N(x_1, x_2, \cdots, x_N; \theta)$.

Furthermore

(5.1.16) $\phi_1^i \phi_i = \phi_{2i} + \sum_{u=1}^{2i-1} d_u \cdot \phi_{2i-u} + d_{2i} \cdot \phi_0$

where

(5.1.17) $d_{2i} = \prod_{j=1}^{i} B_j$

and

(5.1.18) $B_j = \frac{j(j - 1)}{2} z_2 + \frac{j}{K}$.

Hence

(5.1.19) $E(\phi_i \cdot \phi_i) = \prod_{j=1}^{i} B_j$.

Thus if we divide $\phi_i$ by $\sqrt{\prod_{j=1}^{i} B_j}$, (5.1.15) becomes the orthonormal set.
j=l 


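Some cases, where we obtain $\phi_i$ as orthogonal polynomials, are given below.

1. $p_N = \frac{1}{(2\pi)^{N/2}} \exp\left(-\frac{1}{2} \sum_{i=1}^{N} (x_i - \theta)^2\right)$, $\quad \phi_1 = \sum_{i=1}^{N} x_i - N\theta$.

2. $p_N = \frac{1}{(2\pi\theta)^{N/2}} \exp\left(-\frac{1}{2\theta} \sum_{i=1}^{N} x_i^2\right)$, $\quad \phi_1 = \frac{\sum_{i=1}^{N} x_i^2}{2\theta^2} - \frac{N}{2\theta}$.

3. $p_N = \theta^{\sum x_i} (1 - \theta)^{N - \sum x_i}$ ($x_i = 1$ with prob. $\theta$, $x_i = 0$ with prob. $1 - \theta$), $\quad \phi_1 = \frac{\sum_{i=1}^{N} x_i - N\theta}{\theta(1 - \theta)}$.

4. $p_N = \frac{e^{-N\theta}\, \theta^{\sum x_i}}{\prod_{i=1}^{N} x_i!}$, $\quad \phi_1 = \frac{\sum_{i=1}^{N} x_i - N\theta}{\theta}$.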




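$A_i$ and $B_i$, the coefficients of $\phi_i$ and $\phi_{i-1}$ respectively in (5.1.12) for the above four cases are given as below:

      $A_i$                                        $B_i$
1.    $0$                                          $iN$
2.    $2i/\theta$                                  $\frac{i(i-1)}{\theta^2} + \frac{iN}{2\theta^2}$
3.    $\frac{i(1 - 2\theta)}{\theta(1 - \theta)}$   $\frac{-i(i-1)}{\theta(1 - \theta)} + \frac{iN}{\theta(1 - \theta)}$
4.    $i/\theta$                                   $iN/\theta$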


It may be mentioned that in all these cases $\{\phi_i\}$ are also a complete set of polynomials.
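The orthogonality asserted in §5.1 can be illustrated for the binomial case (case 3) with exact rational arithmetic. The sketch below is an illustration under stated assumptions, not part of the original proof: it builds $\phi_0, \phi_1, \cdots$ from the three-term recurrence $\phi_{i+1} = (\phi_1 - A_i)\phi_i - B_i\phi_{i-1}$ with $A_i = i(1 - 2\theta)/(\theta(1 - \theta))$ and $B_i = (iN - i(i-1))/(\theta(1 - \theta))$, and takes expectations over the binomial distribution of $\sum x_i$, on which the $\phi_i$ depend. The function names `phi_table` and `expect` are hypothetical.

```python
from fractions import Fraction
from math import comb


def phi_table(N, theta, max_deg):
    """phi[i][k] is phi_i evaluated at sum(x) = k, for i = 0..max_deg."""
    q = theta * (1 - theta)
    phi = [[Fraction(1)] * (N + 1)]
    phi.append([(k - N * theta) / q for k in range(N + 1)])
    for i in range(1, max_deg):
        A = i * (1 - 2 * theta) / q
        B = (i * N - i * (i - 1)) / q
        phi.append([(phi[1][k] - A) * phi[i][k] - B * phi[i - 1][k]
                    for k in range(N + 1)])
    return phi


def expect(values, N, theta):
    """Expectation of a function of k = sum(x) under Binomial(N, theta)."""
    return sum(comb(N, k) * theta ** k * (1 - theta) ** (N - k) * v
               for k, v in enumerate(values))


def gram(N, theta, max_deg):
    """Matrix of E(phi_i * phi_j) for i, j = 0..max_deg."""
    phi = phi_table(N, theta, max_deg)
    return [[expect([a * b for a, b in zip(phi[i], phi[j])], N, theta)
             for j in range(max_deg + 1)] for i in range(max_deg + 1)]
```

With `theta` given as a `Fraction`, the off-diagonal entries of `gram(N, theta, max_deg)` come out exactly zero and the diagonal entries equal the products $\prod_{j=1}^{i} B_j$, provided `max_deg` does not exceed `N`.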


§5.2. Let $\sum_{i=1}^{m} K_i \phi_i(n)$, where $K_i$ $(i = 1, 2, \cdots, m)$ depends upon $\theta$, be such that $\sum_{i=1}^{m} K_i \phi_i(n)$ and $\phi_i(n)$ satisfy the regularity conditions mentioned in §2.2. Then we will show that $\sum_{i=1}^{m} K_i \phi_i(n)$ cannot be a function of $x_1, x_2, \cdots, x_n$ alone except for constant zero.

Proof: Let us assume that $\sum_{i=1}^{m} K_i \cdot \phi_i(n)$ is independent of $\theta$, that is, it is some statistic, say,

(5.2.1) $\theta^*(x_1, x_2, \cdots, x_n) = \sum_{i=1}^{m} K_i \cdot \phi_i(n)$.

Taking expectations of both the sides, we obtain:

(5.2.2) $E(\theta^*(x_1, x_2, \cdots, x_n)) = \sum_{i=1}^{m} K_i \cdot E\phi_i(n) = 0$.

Differentiating (5.2.2) $i$ times with regard to $\theta$, we have, because of the regularity conditions on $\phi_i(n)$ and $\theta^*(x_1, x_2, \cdots, x_n)$,

(5.2.3) $E[\theta^*(x_1, x_2, \cdots, x_n) \cdot \phi_i(n)] = 0, \quad i = 1, 2, \cdots, m$.

It may be noted this is similar to the result in (2.3). From (5.2.3) and (5.2.1) it follows that

(5.2.4) $E[\theta^{*2}(x_1, x_2, \cdots, x_n)] = 0$.

Thus $\theta^*(x_1, x_2, \cdots, x_n)$ is zero with probability one, that is,

$\sum_{i=1}^{m} K_i \cdot \phi_i(n)$,

if independent of $\theta$, is zero with probability one. This proves our assertion that this cannot be a function of $x_1, x_2, \cdots, x_n$ alone except for constant zero.

From the foregoing we deduce the following conclusions:

I. $\phi_i(n)$ or any power of it cannot be a function of the observations free of $\theta$.







II. If a statistic $\theta^*(x_1, x_2, \cdots, x_n)$, which is not a constant with probability one, can be put in the form

(5.2.5) $\theta^*(x_1, x_2, \cdots, x_n) = K_0 + \sum_{i=1}^{m} K_i \cdot \phi_i(n)$,

where $m$ is some finite positive integer, then

(i) $K_0$ must depend upon $\theta$,

(ii) The expression (5.2.5) for $\theta^*(x_1, x_2, \cdots, x_n)$ in $\phi_i(n)$ is unique.

(iii) No other unbiased estimate of $K_0$ satisfying the regularity conditions can be put in the form (5.2.5).

(iv) When $m = 1$, there is no other statistic except $a\theta^* + b$, where $a$ and $b$ are constants independent of $\theta$, which can be put in the above form $K_0 + K_1 \cdot \phi_1(n)$, provided that $K_0$ and $K_1$ are differentiable functions of $\theta$ and $K_1$ does not vanish for any $\theta$ under consideration.

(v) Let $\xi$ be any function of $x_1, x_2, \cdots, x_n$ free of $\theta$, satisfying the regularity conditions of §2.2 with $E(\xi) = 0$. Since the covariance between $\xi$ and $\theta^*(x_1, x_2, \cdots, x_n)$ in (5.2.5) is equal to zero, the statistic of the form (5.2.5) has the least variance of all unbiased estimates of $K_0$ that satisfy the regularity conditions of §2.2.

Also, if the probability density or the probability function depends on more than one parameter, then all the above results except (iv) hold good if

$\sum_{i=1}^{m} K_i \cdot \phi_i(n)$

is replaced by

$\sum_{i_1 + i_2 + \cdots + i_T \leq m} K_{i_1 i_2 \cdots i_T}\, \phi_{i_1 i_2 \cdots i_T}(n)$.

§5.3. Let us now prove the assertion made in (iv) of §5.2, when m is equal to 
one. 


Suppose the contrary that there is a statistic $\theta_1^*(x_1, x_2, \cdots, x_n)$ which is of the form

(5.3.1) $\theta_1^*(x_1, x_2, \cdots, x_n) = L_0 + L_1 \cdot \phi_1(n)$.

$\theta^*(x_1, x_2, \cdots, x_n)$, of course, has the form

(5.3.2) $\theta^*(x_1, x_2, \cdots, x_n) = K_0 + K_1 \cdot \phi_1(n)$.

We will assume $K_0$, $K_1$, $L_0$, $L_1$ to be differentiable functions of $\theta$ and that $K_1$, $L_1$ do not vanish for values of $\theta$ under consideration.

Differentiating, with respect to $\theta$, the expressions in (5.3.1) and (5.3.2), we have

(5.3.3) $\frac{dL_0}{d\theta} + \frac{dL_1}{d\theta} \cdot \phi_1 + L_1(\phi_2 - \phi_1^2) = 0$;

(5.3.4) $\frac{dK_0}{d\theta} + \frac{dK_1}{d\theta} \cdot \phi_1 + K_1(\phi_2 - \phi_1^2) = 0$,

where $\phi_i$ is short for $\phi_i(n)$. Taking the expectations of the above and rearranging, it follows that

(5.3.5) $E(\phi_1^2) = \frac{1}{L_1} \frac{dL_0}{d\theta} = \frac{1}{K_1} \frac{dK_0}{d\theta}$.

From (5.3.3) to (5.3.5), we deduce that

(5.3.6) $\frac{1}{L_1} \frac{dL_1}{d\theta} = \frac{1}{K_1} \frac{dK_1}{d\theta}$.

Now solving the above differential equation, we get

(5.3.7) $L_1 = aK_1$,

where $a$ is a constant independent of $\theta$. From (5.3.5) and (5.3.7) it follows that

(5.3.8) $L_0 = aK_0 + b$,

where $b$ is a constant independent of $\theta$. From (5.3.7) and (5.3.8) we conclude that the statistic in (5.3.1) must be of the form $a\theta^* + b$, which proves our assertion. An immediate consequence is that if there exists an efficient statistic for estimating $\psi(\theta)$, then no other function of $\theta$ except $a\psi(\theta) + b$ can have an efficient estimate.³
§5.4. If $\theta^*(x_1, x_2, \cdots, x_n)$ is an unbiased estimate of $\theta$ satisfying the following conditions:

(i) Among all unbiased estimates of $\theta$ having finite variances, which are also functions of $\theta^*$, $\theta^*$ is one with the least variance,

(ii) For all $\theta$ there exists a complete set of polynomials with respect to the distribution function of $\theta^*$,

then there exists no unbiased estimate of $\theta$ with a finite variance, which is functionally dependent upon $\theta^*$, except $\theta^*$ itself.

Proof: Let $\theta^*$ be the unbiased estimate of $\theta$ which has the least variance among all unbiased estimates of $\theta$ which are functions of $\theta^*$. Further let $S(\theta^*)$ be any function of $\theta^*$, free of $\theta$, whose expectation exists and is equal to zero. Let the variance of $S(\theta^*)$ be finite. It is well known that for any such $S(\theta^*)$

(5.4.1) $E(\theta^* S(\theta^*)) = 0$.

Now $\theta^* S(\theta^*)$ in turn having expectation equal to zero, we obtain

(5.4.2) $E(\theta^{*2} S(\theta^*)) = 0$.

Repeating the above $i$ times we obtain, in general, that

(5.4.3) $E(\theta^{*i} S(\theta^*)) = 0$

for all positive integers $i$. From the above, with the help of condition (ii), we conclude that $S(\theta^*)$ must be equal to zero. Thus if $H(\theta^*)$ is an unbiased estimate of $\theta$ with finite variance, then from above, $H(\theta^*) - \theta^*$, having the expectation zero and a finite variance, must be zero with probability one. Thus $H(\theta^*)$ is the same as $\theta^*$, which proves the result.

³ We assume the existence of $\frac{d^i \psi}{d\theta^i}$ $(i = 1, 2)$ and $E(\phi_1^2)$ for all $\theta$, and also postulate that $\frac{d\psi(\theta)}{d\theta}$ and $E(\phi_1^2)$ do not vanish for any $\theta$ under consideration.

Example. If $\theta^*$ is of the form (5.2.5) and condition (ii) is satisfied, then there is no function of $\theta^*$, free of $\theta$ and having a finite variance, whose expectation is $K_0$.

Conditions (i) and (ii) above are satisfied for estimating $\theta$ in the examples quoted at the end of section 5.1, and thus in these cases the result holds good when $\theta^*$ is the efficient estimate.

I am highly thankful to Professor J. Wolfowitz for his guidance and help in 
this research. 


REFERENCES

[1] H. Cramér, Mathematical Methods of Statistics, Princeton Univ. Press, 1946, p. 480.

[2] C. R. Rao, "Information and the accuracy attainable in the estimation of statistical parameters," Calcutta Math. Soc. Bull., September, 1945.

[3] A. Bhattacharyya, "On some analogues of the amount of information and their use in statistical estimation," Sankhyā, Vol. 8 (1946); also "On some analogues of the amount of information and their use in statistical estimation," Sankhyā, Vol. 8 (1947).

[4] H. Cramér, "Contributions to the theory of statistical estimation," Skandinavisk Aktuar. tids., Vol. 29 (1946), pp. 85-94.

[5] J. Wolfowitz, "Efficiency of sequential estimates," Annals of Math. Stat., Vol. 18 (1947).

[6] Blackwell and Girshick, "A lower bound for the variance of some unbiased sequential estimates," Annals of Math. Stat., Vol. 18 (1947).

[7] B. O. Koopman, "On distributions admitting a sufficient statistic," Am. Math. Soc. Trans., Vol. 39 (1936), p. 399.







ON THE THEORY OF SOME NON-PARAMETRIC HYPOTHESES 
By E. L. LEHMANN AND C. STEIN 


University of California, Berkeley 


Summary. For two types of non-parametric hypotheses optimum tests are derived against certain classes of alternatives. The two kinds of hypotheses are related and may be illustrated by the following example: (1) The joint distribution of the variables $X_1, \cdots, X_m, Y_1, \cdots, Y_n$ is invariant under all permutations of the variables; (2) the variables are independently and identically distributed. It is shown that the theory of optimum tests for hypotheses of the first kind is the same as that of optimum similar tests for hypotheses of the second kind. Most powerful tests are obtained against arbitrary simple alternatives, and in a number of important cases most stringent tests are derived against certain composite alternatives. For the example (1), if the distributions are restricted to probability densities, Pitman's test based on $\bar{y} - \bar{x}$ is most powerful against the alternatives that the $X$'s and $Y$'s are independently normally distributed with common variance, and that $E(X_i) = \xi$, $E(Y_j) = \eta$ where $\eta > \xi$. If $\eta - \xi$ may be positive or negative the test based on $|\bar{y} - \bar{x}|$ is most stringent. The definitions are sufficiently general that the theory applies to both continuous and discrete problems, and that tied observations present no difficulties. It is shown that continuous and discrete problems may be combined. Pitman's test for example, when applied to certain discrete problems, coincides with Fisher's exact test, and when $m = n$ the test based on $|\bar{y} - \bar{x}|$ is most stringent for hypothesis (1) against a broad class of alternatives which includes both discrete and absolutely continuous distributions.


1. Generalities. In the present paper we study the problem of determining 
optimum tests for certain non-parametric hypotheses. It is important in this 
connection to make some distinctions which are of lesser significance when the 
problem is approached from the intuitive point of view which has been customary
in this field. Consider for example the hypothesis $H$ that $Z_1, \cdots, Z_N$ are
independently and identically distributed according to an unknown probability 
density function. All tests which have been suggested for testing H are valid 
also for testing the hypothesis H’ that the unknown joint probability density 
function of the Z’s is symmetric in its N arguments. On the other hand, tests 
which have optimum properties for testing H’ against a certain class of alterna- 
tives will in general not possess the same properties when H’ is replaced by H. 
From the present point of view the two hypotheses mentioned are essentially 
different. We shall be concerned in this paper primarily with generalizations 
of H’, and we shall show that many of the tests suggested in the literature have 
optimum properties for testing hypotheses of this kind against certain classes of 
alternatives. 

The corresponding general theory for hypotheses related to H is quite different. 




However the two theories do coincide, provided tests of these latter hypotheses 
are restricted to similar regions. More specifically, all results on optimum 
tests of H’ are equivalent to the corresponding results on optimum similar tests 
of H, and this equivalence holds also for many of the more general hypotheses 
considered in this paper. 

It should be observed that in many experimental situations, the hypothesis 
H’ that the joint distribution of the Z’s is invariant under all permutations is 
more realistic than the hypothesis H that the Z’s are independently and identi- 
cally distributed. For example, suppose there is a block of land divided into 
m + n plots, and the experimenter wants to test whether one of two fertilizers 
(used in fixed amounts) is more effective than the other in increasing the yield 
of a certain plant. Of the plots, m are chosen at random; fertilizer I is applied 
to these, and fertilizer II to the other $n$. If $X_i$ denotes the yield from the $i$th
plot to which fertilizer I has been applied and $Y_j$ denotes the yield from the $j$th
plot to which fertilizer II has been applied, where the plots are numbered at
random, then the hypothesis that the two fertilizers are completely equivalent
implies that the application of any permutation to $X_1, \cdots, X_m, Y_1, \cdots, Y_n$
does not change their joint distribution. But it is not reasonable to suppose the
$X_i$, $Y_j$ are independently and identically distributed, since there may be intrinsic
differences among the plots. For discussions of these and related points, see 
Fisher [1], Neyman [2], Pitman [3]. It may be that in many particular cases 
some hypothesis between the two is really appropriate but the hypothesis H is the 
only one that is evidently appropriate from a cursory inspection of the setup. 

Many of the alternative hypotheses considered below, for example those 
involving normality, are dictated more by tradition and ease of treatment than 
by appropriateness in actual experiments. Thus this paper should not be 
considered as providing absolute justification for tests such as Pitman’s but 
rather as suggesting a method of obtaining optimum non-parametric tests when 
the class of alternatives is fairly well specified. 

Another possibility, first raised by Neyman [2], which has been ignored in this 
paper is the equality on the average of the two fertilizers but with fertilizer I 
having a larger dispersion than fertilizer II, or a distribution differing in some 
other characteristic. It would be reasonable to consider this as part of the 
hypothesis tested, but tests based on randomization may give a probability of 
rejection of the hypothesis of equivalence in this case which is much higher than 
the stated level of significance. We hope to return to problems of this type in 
later papers. 

Let us make the following basic assumptions. $\mathcal{Z}$ is a space of points $z$ and $\mathcal{A}$ is an additive class of subsets $A$ of $\mathcal{Z}$. Any member of $\mathcal{A}$ will be said to be measurable. By a probability distribution we mean a measure $F$, defined over $\mathcal{A}$, for which $F(\mathcal{Z}) = 1$. We shall be concerned with two classes of probability distributions: One, the class of all distributions, and two, the class of distributions which are absolutely continuous with respect to a given measure $\mu$, that is, the class of distributions $F$ for which there exists a function $f$ such that

(1.1) $F(A) = \int_A f(z)\, d\mu(z)$.

We shall call $f$ a generalized probability density function with respect to $\mu$. By $Z$ we denote a random variable such that for any $A$ in $\mathcal{A}$,

(1.2) $P\{Z \in A\} = F(A)$.


For most of the applications we shall take $\mathcal{Z}$ to be a Euclidean space, and $\mathcal{A}$ to be the class of all Borel sets. Then if $\mu$ is Lebesgue measure, (1.1) states that $f$ is a probability density function in the usual sense. However, we shall have occasion to consider also some measures other than Lebesgue measure. By a hypothesis $H$ we mean a class of probability distributions. Next we describe the hypotheses with which we shall be concerned. Let $\Pi$ be a partition of $\mathcal{Z}$, that is, let $\Pi$ be a class of mutually exclusive subsets $S$ of $\mathcal{Z}$ such that every point $z$ of $\mathcal{Z}$ lies in one of the sets $S$. If two points $z_1$ and $z_2$ lie in the same set $S$, we shall say that $z_1$ is equivalent to $z_2$ with respect to $\Pi$: $z_1 \sim z_2 \pmod{\Pi}$. The set of all points which are equivalent to $z$ will be denoted by $T(z)$, the number of points of $T(z)$ by $n(z)$. Concerning $\Pi$ we make the following assumptions:

(i) All sets in $\Pi$ are finite, so that $n(z)$ is finite for all $z$.

(ii) If we define $S_n$ as the union of all those sets $S$ of $\Pi$ which contain exactly $n$ points, there exist mutually exclusive sets $S_n^{(1)}, \cdots, S_n^{(n)}$ which are measurable and such that every element $S$ of $\Pi$ containing exactly $n$ points has one and only one point in common with each $S_n^{(i)}$.

We shall say that a measure $\mu$ is invariant under $\Pi$ if the following condition holds: For all $n$ and $i, j \leq n$, if $S$ is any set contained in $S_n^{(i)}$ and if $S'$ denotes the set of equivalent points in $S_n^{(j)}$, then $\mu(S) = \mu(S')$.

Given a partition $\Pi$ satisfying (i) and (ii), we formulate the hypothesis $H$ that the distribution $F$ of $Z$ is invariant under $\Pi$. We shall refer to $H$ as the hypothesis of invariance under $\Pi$. We shall also consider the hypothesis of invariance under a partition for a class of generalized densities $f$. In this case we assume that the measure $\mu$ of (1.1) is given, and that $\Pi$, in addition to (i) and (ii), satisfies the condition:

(iii) The measure $\mu$ is invariant under $\Pi$. The hypothesis $H$ in this case states that $z_1 \sim z_2 \pmod{\Pi}$ implies $f(z_1) = f(z_2)$.

By a test of a hypothesis $H$ we mean (see [4]) a measurable function $\phi$ on $\mathcal{Z}$ to the interval $[0, 1]$ which with every point $z \in \mathcal{Z}$ associates a probability $\phi(z)$ of rejection. This definition, slightly more general than the usual one, is particularly useful in non-parametric work. Among other advantages it automatically takes care of the problem of tied observations. It also disposes of the difficulties encountered by Scheffé [5] in his treatment of the problem of similar regions, as will be shown in Lemma 1.

The size of a test $\phi$ is defined to be

(1.3) $\epsilon(\phi) = \sup_{F \in H} \int \phi(z)\, dF(z)$.







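If in particular

(1.4) $\int \phi\, dF = \epsilon(\phi)$

for all $F$ in $H$, $\phi$ is said to be similar for testing $H$. Extending the terminology of Scheffé, we say that $\phi$ has structure $S(\epsilon)$ if for all $z$ in $\mathcal{Z}$,

(1.5) $\sum_{z' \in T(z)} \phi(z') = n(z)\, \epsilon$.

The following lemma extends a result of Scheffé.

Lemma 1. For testing a hypothesis of invariance, any test of structure $S(\epsilon)$ is similar and of size $\epsilon$.

Proof. For any $F$ in $H$ and any $\phi$

(1.6) $\int \phi\, dF = \sum_{n=1}^{\infty} \sum_{i=1}^{n} \int_{S_n^{(i)}} \phi\, dF = \sum_{n=1}^{\infty} \int_{S_n^{(1)}} \sum_{z' \in T(z)} \phi(z')\, dF(z)$.

But $\phi$ has structure $S(\epsilon)$ and hence (1.5) holds for all $z$. Therefore

(1.7) $\int \phi\, dF = \sum_{n=1}^{\infty} n\epsilon \int_{S_n^{(1)}} dF = \epsilon$.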
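As an illustration of structure $S(\epsilon)$, consider the two-sample example of the Summary, with $T(z)$ taken as the set of distinct rearrangements of the observed values. The sketch below is a minimal construction under assumptions of its own and is not taken from the paper: it ranks the points of $T(z)$ by $\bar{y} - \bar{x}$, assigns rejection probability one to the largest values, and randomizes on the boundary value so that the rejection probabilities over $T(z)$ sum to $n(z)\epsilon$. The function names are hypothetical, and the inputs are assumed to be integers or `Fraction`s so that the arithmetic is exact.

```python
from fractions import Fraction
from itertools import permutations


def orbit(z):
    """T(z): the distinct points obtained by permuting the coordinates of z."""
    return sorted(set(permutations(z)))


def mean_difference(z, m):
    """y-bar minus x-bar, the first m coordinates being the x's."""
    x, y = z[:m], z[m:]
    return Fraction(sum(y), len(y)) - Fraction(sum(x), len(x))


def structured_test(z, m, eps):
    """Rejection probabilities on T(z) that sum to n(z) * eps.

    Points with the largest statistic are rejected first; points sharing
    the boundary value (for instance because of ties) share what is left.
    """
    points = orbit(z)
    budget = Fraction(eps) * len(points)  # n(z) * eps
    groups = {}
    for p in points:
        groups.setdefault(mean_difference(p, m), []).append(p)
    phi = {}
    for value in sorted(groups, reverse=True):
        group = groups[value]
        share = min(Fraction(1), budget / len(group))
        for p in group:
            phi[p] = share
        budget -= share * len(group)
    return phi
```

For instance, with `z = (1, 2, 2, 5, 7)`, `m = 2` and `eps = Fraction(1, 10)`, the tied observations give an orbit of 60 points, and the returned values lie in $[0, 1]$ and sum to 6, so that (1.5) holds on this orbit.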


We shall show next that for testing a hypothesis of invariance at level of significance $\epsilon$, only tests of structure $S(\epsilon)$ need be considered. In order to make this result applicable both to hypotheses referring to the class of all distributions and to those referring to a class of generalized densities, we shall state it in an asymmetric form which when taken together with Lemma 1 indicates the essential equivalence of the two types of hypotheses.

Lemma 2. If $\phi$ is any test of a hypothesis of invariance for the class of generalized densities with respect to a fixed measure $\mu$, and if the size of $\phi$ is less than or equal to $\epsilon$, then there exists a test $\phi_1$ of structure $S(\epsilon)$ such that

(1.8) $\int \phi_1\, dF \geq \int \phi\, dF$

for all probability distributions $F$.
Proof. First we shall show that

(1.9)    \frac{1}{n(z)} \sum_{z' \in T(z)} \varphi(z') \le \varepsilon

almost everywhere μ. For let A be the set of points z such that

(1.10)   \frac{1}{n(z)} \sum_{z' \in T(z)} \varphi(z') > \varepsilon

and suppose that μ(A) is positive. Let

(1.11)   f(z) = \begin{cases} 1/\mu(A) & \text{if } z \in A, \\ 0 & \text{elsewhere.} \end{cases}

Then f is in H since by definition of A, whenever z is in A, T(z) is contained
in A. But

(1.12)   \int \varphi f\, d\mu > \varepsilon \int f\, d\mu = \varepsilon,

in contradiction to the assumption that φ has size ε.

From (1.9) it follows easily that there exists a test φ_1 of structure S(ε)
such that for all z

(1.13)   \varphi_1(z) \ge \varphi(z).

Since condition (1.8) is then satisfied, this completes the proof.
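
The existence step in the proof of lemma 2 can be made concrete. The sketch below shows one possible construction, not necessarily the one the authors had in mind: on an equivalence class whose average rejection probability is m ≤ ε, every value of the test is moved toward 1 by the same fraction (ε − m)/(1 − m) of its remaining room. The function name `raise_to_structure` and the use of exact fractions are assumptions made for this illustration.

```python
from fractions import Fraction

def raise_to_structure(phi_class, eps):
    """Values of a test on ONE finite equivalence class T(z), with class
    average m <= eps < 1.  Returns values phi_1 >= phi whose class average
    is exactly eps.  Hypothetical helper, not taken from the paper."""
    n = len(phi_class)
    m = sum(phi_class, Fraction(0)) / n
    if m > eps:
        raise ValueError("class average exceeds eps, so (1.9) fails here")
    # Move every value toward 1 by the same fraction of its remaining room.
    shift = (eps - m) / (1 - m)
    return [p + (1 - p) * shift for p in phi_class]

# A class of four points on which the original test has average 1/8.
phi = [Fraction(1, 2), Fraction(0), Fraction(0), Fraction(0)]
phi_1 = raise_to_structure(phi, Fraction(1, 4))
```

On this class the new values sum to nε = 1 and dominate the old ones pointwise, which are the two properties used in (1.13) and (1.8).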

Lemma 2 raises the question whether it is possible to reduce the problem of
testing a hypothesis of invariance still further, or whether the tests of structure
S(ε) form what Wald [6] has called an essentially complete class of admissible
tests. This question is answered by

THEOREM 1. Let μ be a measure defined over 𝒞. Let Π_0 and Π_1 be two partitions
of S satisfying conditions (i), (ii) and (iii), and such that z ~ z' (mod Π_1) implies
z ~ z' (mod Π_0). For the class of generalized densities with respect to μ denote by
H_i (i = 0, 1) the hypothesis of invariance relative to Π_i. Then for testing H_0
against H_1 at level of significance ε, the totality of tests which (a) have structure
S(ε), and for which (b) z ~ z' (mod Π_1) implies φ(z) = φ(z'), form an essentially
complete class of admissible tests.

Proof. It is easily seen that we can restrict ourselves to that subclass of
tests of structure S(ε) which possess property (b). For if φ is any test of
structure S(ε) relative to Π_0, let

(1.14)   \varphi^*(z) = \frac{1}{n_1(z)} \sum_{z' \sim z \ (\mathrm{mod}\ \Pi_1)} \varphi(z'),

where n_1(z) denotes the number of points equivalent to z (mod Π_1). Then clearly
φ* possesses property (b) and has structure S(ε). Furthermore if f
is any probability density function of H_1, then

(1.15)   \int \varphi^* f\, d\mu = \int \varphi f\, d\mu,

so that φ and φ* have identical power against H_1.

In order to complete the proof, we must show that if φ_1 and φ_2 are any two tests
satisfying (a) and (b), and if φ_1 and φ_2 differ on a set of positive measure, there
exists a probability density function f of H_1 for which

(1.16)   \int \varphi_1 f\, d\mu > \int \varphi_2 f\, d\mu.

Since both φ_1 and φ_2 have structure S(ε), the set A of points z for which

(1.17)   \varphi_1(z) > \varphi_2(z)

has positive measure. Also, because of (b), if two points are equivalent relative
to Π_1, they are either both in A or both not in A. If f(z) is defined as 1/μ(A)
for z in A and as zero elsewhere, then f is in H_1 and satisfies (1.16).





The theorem obtained from theorem 1 by letting the hypotheses H_0 and H_1
refer to the class of all probability distributions rather than to a particular class
of generalized densities is clearly also true, and cases between these two theorems
could also be formulated.

Since the most powerful test φ for testing a hypothesis of invariance H_0
referring to a class of generalized densities against an alternative f from this
class of densities has the correct size also for testing the wider hypothesis H_0
referring to the class of all distributions, φ is also most powerful for testing H_0
against f. The corresponding remark holds for most stringent tests. Therefore
all optimum tests that will be derived in the sequel, through the use of theorems
of this section, may be considered as tests of hypotheses referring to the class
of all distributions: they are valid against these hypotheses, and no power is
gained by restricting the hypothesis to the appropriate class of generalized
densities.


2. Most powerful tests and most stringent tests. One of the main problems
to be considered in this paper is the determination of a most powerful test of a
hypothesis of invariance against a simple alternative. If we restrict our
considerations to the class of generalized densities with respect to μ, a complete
solution of this problem is given by the following

THEOREM 2. Let H be the hypothesis of invariance under the partition Π, and
let g be a probability density function not in H. For any z in S, denote by
z^{(1)}, ..., z^{(n)} the n points of T(z) arranged so that
g(z^{(1)}) ≥ g(z^{(2)}) ≥ ... ≥ g(z^{(n)}). For testing H against g a most
powerful test of size ε is given by

(2.1)    \varphi(z) = \begin{cases}
             1 & \text{if } g(z) > g(z^{(1+[\varepsilon n])}), \\
             a & \text{if } g(z) = g(z^{(1+[\varepsilon n])}), \\
             0 & \text{if } g(z) < g(z^{(1+[\varepsilon n])}),
         \end{cases} \qquad \text{for } z \in S_n,

where \sum_{i=1}^{n} φ(z^{(i)}) = nε, 0 ≤ a ≤ 1, and where a may depend on z through T(z).

Proof. First we observe that the number of z^{(i)} for which g(z^{(i)}) ≥ g(z^{(1+[εn])})
is greater than or equal to 1 + [εn] > εn and that the number of z^{(i)} for which
g(z^{(i)}) > g(z^{(1+[εn])}) is less than or equal to [εn] ≤ εn, so that there exists an a
between 0 and 1 for which Σ φ(z^{(i)}) = nε. Since φ has structure S(ε), it follows
from lemma 1 that it is similar and of size ε.


Let

(2.2)    g^*(z) = g(z^{(1+[\varepsilon n])}) \qquad \text{for } z \in S_n.

To complete the proof consider first the special case that

(2.3)    \int g^*(z)\, d\mu(z)

vanishes. Then

(2.4)    \int \varphi g\, d\mu = \int g\, d\mu = 1,

that is, the test φ has power 1, and therefore is clearly most powerful. Assume
next that the integral (2.3) is positive. Then g* is proportional to a probability
density function of H. For it is measurable and satisfies the symmetry condition
required of a member of H, and the integral (2.3) is finite since

(2.5)    \int g^*(z)\, d\mu(z)
             \le \sum_n \int_{S_n} \frac{1}{[\varepsilon n]+1} \sum_{i=1}^{[\varepsilon n]+1} g(z^{(i)})\, d\mu(z)
             \le \frac{1}{\varepsilon} \sum_n \int_{S_n} \frac{1}{n} \sum_{i=1}^{n} g(z^{(i)})\, d\mu(z) = \frac{1}{\varepsilon}.

The test φ therefore has the form of a probability ratio test. Since it is also
similar, it follows from theorem 1 of [4] that φ is most powerful.
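
The rule (2.1) acts on one equivalence class at a time, so it can be sketched for a single finite class. The Python fragment below is an illustration under stated assumptions: the class is given as a list of the values g(z') for z' in T(z), the level is an exact `Fraction` with 0 < ε < 1, and the name `most_powerful_phi` is hypothetical.

```python
from fractions import Fraction
from math import floor

def most_powerful_phi(g_values, eps):
    """Rejection probabilities of (2.1) on one equivalence class.

    g_values: the values g(z') for the n points z' of T(z).
    eps:      the level, a Fraction with 0 < eps < 1.
    Returns one rejection probability per entry of g_values."""
    n = len(g_values)
    ordered = sorted(g_values, reverse=True)
    critical = ordered[floor(eps * n)]      # g(z^(1+[eps n])), 0-based index
    n_above = sum(v > critical for v in g_values)
    n_tied = sum(v == critical for v in g_values)
    a = (eps * n - n_above) / n_tied        # makes the class total equal n*eps
    return [Fraction(1) if v > critical else a if v == critical else Fraction(0)
            for v in g_values]

# Six points in the class, level 1/4, one tie at the critical value.
phi = most_powerful_phi([5, 3, 3, 2, 1, 0], Fraction(1, 4))
```

Here [εn] = 1, the critical value is 3, and the two tied points share the remaining probability, so that the rejection probabilities on the class add up to nε = 3/2 as required for structure S(ε).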

In practice one is usually interested in composite rather than simple alterna- 
tives. We shall therefore consider next the problem of deriving most stringent 
tests of hypotheses of invariance against certain classes of alternatives. This 
problem may be reduced to that of finding tests which maximize the minimum 
power over a class of alternatives by the following simple theorem of Hunt and 
Stein [7]. 

THEOREM 3. Given a hypothesis H and a class of alternatives {g_θ}, θ ∈ Ω, denote
by β*(θ) the envelope power function corresponding to the level of significance ε,
that is, let

(2.6)    \beta^*(\theta) = \sup_{\varphi} \beta(\varphi, \theta),

where β(φ, θ) stands for the power of the test φ against the alternative g_θ and where
the least upper bound is taken over all tests φ of size ε. Let {Ω_δ} be a class of
mutually exclusive subsets of Ω such that ∪ Ω_δ = Ω and such that β*(θ) is constant on
each Ω_δ. Denote by φ_δ a test which maximizes the minimum power over Ω_δ. If
φ_δ = φ is independent of δ, then φ is most stringent¹ for testing H against Ω at level of
significance ε.

For obtaining tests which maximize the minimum power over a class of 
alternatives to a hypothesis of invariance, we can state the following simple 
extension of theorem 2. 

THEOREM 4. Let H be a hypothesis of invariance, and let H_1 be the class of
alternatives {g_θ}, θ ∈ Ω. Suppose there exists a subset Ω' of Ω and a probability
measure λ over Ω' such that for the test φ of size ε defined as in theorem 2 with

(2.7)    g(z) = \int g_\theta(z)\, d\lambda(\theta),

the integral ∫ φ g_θ dμ is constant for θ in Ω', and

(2.8)    \int \varphi g_\theta\, d\mu \ge \int \varphi g_{\theta_0}\, d\mu
             \qquad \text{for all } \theta \in \Omega,\ \theta_0 \in \Omega'.

Then φ maximizes the minimum power over Ω at level of significance ε.


¹ A test is said to be most stringent [16] if it minimizes the maximum difference between
envelope power and power, that is, if it minimizes sup_θ [β*(θ) − β(φ, θ)].





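Proof. By theorem 2, φ is a most powerful test for testing H against g,
that is, for any φ' of size ε

(2.9)    \int \varphi'(z) \int g_\theta(z)\, d\lambda(\theta)\, d\mu(z)
             \le \int \varphi(z) \int g_\theta(z)\, d\lambda(\theta)\, d\mu(z).

Consequently

(2.10)   \inf_{\theta \in \Omega} \int \varphi'(z) g_\theta(z)\, d\mu(z)
             \le \int d\lambda(\theta) \int \varphi'(z) g_\theta(z)\, d\mu(z)
             = \int \varphi'(z)\, d\mu(z) \int g_\theta(z)\, d\lambda(\theta)
             \le \int \varphi(z)\, d\mu(z) \int g_\theta(z)\, d\lambda(\theta)
             = \int d\lambda(\theta) \int \varphi(z) g_\theta(z)\, d\mu(z)
             = \inf_{\theta \in \Omega} \int \varphi(z) g_\theta(z)\, d\mu(z).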


3. Normal alternatives. Let H be the hypothesis of invariance under Π, let
T(z) be the set of points equivalent to z (mod Π), and let f and g be two functions
defined over 𝒵. We shall write f ~ g if there exists a function F such that

(3.1)    f(z) = F[g(z), T(z)],

where for any fixed T(z), F is a strictly increasing function of g. We note that
f ~ g in the following two special cases:

(i) f(z) = F[g(z)] where F is strictly increasing;

(ii) f(z) = a(z)g(z) + b(z) where a(z) > 0 for all z, and where z_1 ~ z_2 (mod Π)
implies a(z_1) = a(z_2), b(z_1) = b(z_2).

The usefulness of this notation stems from the following remark. Let g* and
φ be defined as in (2.2) and (2.1) respectively and let f ~ g. If the test ψ is
obtained from φ by substituting f and f* for g and g* respectively, then ψ = φ.

The purpose of the present section is to obtain most powerful and most 
stringent tests of some hypotheses of invariance against certain classes of normal 
alternatives. In particular, problems will be exhibited for which various 
non-parametric tests suggested in the literature possess these optimum properties. 

PROBLEM 1. Suppose that the random variables Z_{ij} (j = 1, ..., s_i;
i = 1, ..., m) have a joint probability density function, and denote by H the
hypothesis that this probability density is invariant under all permutations
of the s_i arguments within the ith group for i = 1, ..., m. Consider the
alternative H_1 that all variables are independently normally distributed with
common variance σ², and that

(3.2)    E(Z_{ij}) = a x_{ij} + b_i,

where a, the b's and the x's are assumed known and where, without essential
loss of generality, we assume a > 0. Assume further that

(3.3)    \sum_{j=1}^{s_i} x_{ij} = 0.

In order to obtain the most powerful test of H against H_1, we apply theorem 2
with

(3.4)    g(z) = c \exp\Bigl\{ -\frac{1}{2\sigma^2} \sum\sum (z_{ij} - a x_{ij} - b_i)^2 \Bigr\}
             \sim \sum\sum (a x_{ij} + b_i) z_{ij} \sim \sum\sum x_{ij} z_{ij}.

The most powerful test is therefore given by (2.1), if we replace g(z) by ΣΣ x_{ij} z_{ij}.
This test being independent of σ², the b's and a > 0, it is uniformly most powerful
against the class of alternatives obtained from H_1 by not specifying the values of
these parameters but restricting a to be positive.

If we drop the restriction a > 0, a uniformly most powerful test no longer
exists; we shall instead obtain the most stringent test against this extended class
of alternatives, using theorems 3 and 4. Clearly the envelope power function is
constant on the surfaces |a|/σ = constant. Take as the Ω' of theorem 4 the
set consisting of the two points (a, b_1, ..., b_m, σ) and (−a, b_1, ..., b_m, σ).
Let λ assign the probability ½ to each of the two points. Then the function g
of (2.7) becomes

(3.5)    \tfrac12 \Bigl(\frac{1}{\sqrt{2\pi}\,\sigma}\Bigr)^{\sum s_i}
             \exp\Bigl\{-\frac{1}{2\sigma^2} \sum\sum (z_{ij} - a x_{ij} - b_i)^2\Bigr\}
         + \tfrac12 \Bigl(\frac{1}{\sqrt{2\pi}\,\sigma}\Bigr)^{\sum s_i}
             \exp\Bigl\{-\frac{1}{2\sigma^2} \sum\sum (z_{ij} + a x_{ij} - b_i)^2\Bigr\}
         \sim \exp\{\sum\sum z_{ij}(a x_{ij} + b_i)\} + \exp\{\sum\sum z_{ij}(-a x_{ij} + b_i)\}
         \sim \exp\{\sum\sum a x_{ij} z_{ij}\} + \exp\{-\sum\sum a x_{ij} z_{ij}\}
         \sim \bigl|\sum\sum x_{ij} z_{ij}\bigr|.

The power of the test φ obtained by substituting this expression for g in (2.1)
is the same at both points of Ω'. For this test is most powerful for testing H
against the simple alternative H' that the density of the Z's is given by the first
member of (3.5). But under the transformation Z'_{ij} = −Z_{ij} + 2b_i, H and H'
and therefore the test φ are left invariant, while the two points of Ω' are permuted.
Condition (2.8) of theorem 4 is therefore satisfied, and hence φ maximizes the
minimum power over Ω. Since furthermore φ is independent of the particular
set Ω' chosen, it follows from theorem 3 that φ is most stringent for the problem
under consideration. In case condition (3.3) is not satisfied, let x'_{ij} = x_{ij} − x̄_{i·}.
Then Σ_j x'_{ij} = 0 and E(Z_{ij}) = a x'_{ij} + b'_i.
Therefore the test criterion (3.5) becomes

(3.6)    \bigl|\sum\sum z_{ij}(x_{ij} - \bar x_{i\cdot})\bigr|
             = \bigl|\sum\sum (z_{ij} - \bar z_{i\cdot})(x_{ij} - \bar x_{i\cdot})\bigr|.
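
The last equivalence in the chain (3.5) rests on a one-line fact, spelled out here as a worked step. Write t for the double sum of x_{ij} z_{ij}; this shorthand is introduced only for the present remark. For any fixed a ≠ 0,

```latex
e^{a t} + e^{-a t} \;=\; 2\cosh(a t) \;=\; 2\cosh\bigl(|a|\,|t|\bigr),
\qquad
\frac{d}{ds}\,\cosh(s) = \sinh(s) > 0 \quad \text{for } s > 0 .
```

Hence the sum of the two exponentials is a strictly increasing function of |t|, which is case (i) of the relation ~ with g(z) = |t|, and the resulting test does not depend on the value of a.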


Some special cases of problem 1 are of particular interest. 

a) Suppose that the variables of the ith group fall into two subgroups, and
write for Z_{ij}: U_{ij} when j = 1, ..., k_i; V_{ij} when j = k_i + 1, ...,
k_i + l_i (k_i + l_i = s_i). Let

(3.7)    x_{ij} = \begin{cases} 0 & \text{for } j = 1, \dots, k_i, \\ 1 & \text{for } j = k_i + 1, \dots, k_i + l_i. \end{cases}

Then the alternatives ascribe to the variables normal distributions with common
variance and such that

(3.8)    E(U_{ij}) = b_i; \qquad E(V_{ij}) = b_i + a.

The criterion becomes

(3.9)    \sum\sum z_{ij}(x_{ij} - \bar x_{i\cdot})
             = \sum_{i=1}^{m} \Bigl( \frac{k_i}{k_i + l_i} \sum_j v_{ij} - \frac{l_i}{k_i + l_i} \sum_j u_{ij} \Bigr)
             = \sum_{i=1}^{m} \frac{k_i l_i}{k_i + l_i}\, (\bar v_i - \bar u_i)

or

(3.10)   \Bigl| \sum_{i=1}^{m} \frac{k_i l_i}{k_i + l_i}\, (\bar v_i - \bar u_i) \Bigr|

according as a is restricted to positive values or not.

b) If we specialize still further and let m = 1, we are dealing with a problem
which would coincide with the two sample problem if we added independence
to the assumptions of the hypothesis. (3.10) becomes |v̄ − ū|, the criterion
suggested by Pitman [3].

c) If instead we set k_i = l_i = 1 for i = 1, ..., m, we are testing
interchangeability within each pair (u_i, v_i) against normal alternatives under which
the means of U_i and V_i are different, the difference being independent of i.
The criterion |Σ (v_i − u_i)| to which (3.10) reduces was first suggested by
R. A. Fisher [1].

d) As a last example set m = 1 in the original problem. Under the hypothesis
the joint density of Z_1, ..., Z_s is symmetric in its s arguments, while under the
alternatives the Z's are normally distributed with common variance and mean
a x_j + b. The criterion reduces to |Σ (z_j − z̄)(x_j − x̄)| which was proposed by
Pitman [3].
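
For case b) the equivalence class of an observed point consists of all reassignments of the pooled observations to the two samples, so the criterion |v̄ − ū| can be referred to its exact permutation distribution. The following Python sketch is an illustration only: it assumes integer-valued data so that the arithmetic is exact, it returns a p-value instead of the randomized rule (2.1), and the function name is hypothetical.

```python
from itertools import combinations
from fractions import Fraction

def pitman_two_sample_pvalue(u, v):
    """Exact permutation p-value for the statistic |mean(v) - mean(u)|.

    Every way of relabelling the pooled observations as k 'u' values and
    l 'v' values is counted once, as under symmetry of the joint density.
    Assumes integer data (illustrative restriction)."""
    pooled = list(u) + list(v)
    k, l = len(u), len(v)
    total = sum(pooled)

    def stat(u_sum):
        # |mean(v) - mean(u)| recovered from the sum of the 'u' group alone
        return abs(Fraction(total - u_sum, l) - Fraction(u_sum, k))

    observed = stat(sum(u))
    sums = [sum(c) for c in combinations(pooled, k)]
    return Fraction(sum(stat(s) >= observed for s in sums), len(sums))

p = pitman_two_sample_pvalue([1, 2, 3], [7, 8, 9])
```

Rejecting when this p-value is at most ε reproduces the part of (2.1) on which the rejection probability equals 1; the randomization on the boundary g(z) = g*(z) is omitted in this sketch.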

We therefore see that several non-parametric tests which have been discussed 
in the literature are most powerful one-sided or most stringent for testing a 
hypothesis of invariance against certain classes of normal alternatives. In a 
later section we shall indicate to what extent these results remain valid if to 
these hypotheses we add the assumption of independence. 
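
For case c) of problem 1, interchanging u_i and v_i within a pair changes the sign of the difference v_i − u_i, so the equivalence class is generated by the 2^m sign patterns. A minimal Python sketch of the resulting exact test based on |Σ (v_i − u_i)| follows; the function name is hypothetical and a p-value is reported instead of the randomized rule (2.1).

```python
from itertools import product
from fractions import Fraction

def paired_randomization_pvalue(u, v):
    """Exact p-value for |sum(v_i - u_i)| under interchange within pairs.

    Swapping u_i and v_i flips the sign of the i-th difference, so all
    2**m sign patterns are enumerated with equal weight."""
    d = [b - a for a, b in zip(u, v)]
    observed = abs(sum(d))
    count = 0
    for signs in product((1, -1), repeat=len(d)):
        if abs(sum(s * x for s, x in zip(signs, d))) >= observed:
            count += 1
    return Fraction(count, 2 ** len(d))

p = paired_randomization_pvalue([1, 2, 3], [2, 4, 6])
```

With differences 1, 2, 3 only the two patterns with all signs equal reach the observed value 6, so the p-value is 2/8.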

The remaining problems will be considered somewhat more briefly since the 
proofs follow the same pattern as in problem 1. 

PROBLEM 2. The conditions of problem 1d) are satisfied in particular if
x_1, ..., x_s are values taken on by random variables X_1, ..., X_s and if under
the alternatives the pairs (X_i, Z_i) have a common bivariate normal distribution
with σ_X = σ_Z. We are then concerned with a problem related to that of testing
for absence of interclass correlation. For the corresponding intraclass problem,
we consider random variables X_1, ..., X_s, Z_1, ..., Z_s, and test the hypothesis
that the joint density of the 2s variables is symmetric in all its arguments,
against the alternatives that the pairs (X_i, Z_i) have a common bivariate normal
distribution, the means and variances of the X's and Z's being the same. We
shall only consider the case of positive correlation. Clearly, the criterion will be
Σ x_i z_i as in the one-sided case of problem 1d). However the tests differ, in that
this expression must now be compared not only with the s! expressions obtained
by permuting the z's among themselves, but instead with the (2s)!/(2^s s!)
expressions obtained by considering all possible ways in which s pairs can be formed
from the complete set of 2s observations.
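
The count (2s)!/(2^s s!) and the comparison it calls for can be checked by direct enumeration for small s. The Python sketch below is an illustration with hypothetical function names; it lists every way of forming s unordered pairs from the 2s pooled observations and counts how many pairings give a value of Σ x_i z_i at least as large as the observed one.

```python
from math import factorial

def pairings(items):
    """Yield every way of splitting an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def intraclass_counts(x, z):
    """Return (number of pairings with criterion >= observed, total pairings)."""
    observed = sum(a * b for a, b in zip(x, z))
    values = [sum(a * b for a, b in p) for p in pairings(list(x) + list(z))]
    return sum(val >= observed for val in values), len(values)

count, total = intraclass_counts([1, 2, 3], [1, 2, 3])
```

For s = 3 the enumeration produces 15 pairings, in agreement with 6!/(2^3 · 3!), and only the observed pairing attains the maximal value 14 of the criterion.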

PROBLEM 3. Consider once more the hypothesis that the joint density of
Z_1, ..., Z_n is symmetric in its n arguments, and consider the alternatives that
the Z's are normally distributed with positive circular serial correlation. Then

(3.11)   g(z) = C \exp\Bigl\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} [(z_i - \xi) - \delta(z_{i+1} - \xi)]^2 \Bigr\}
             \sim \sum_{i=1}^{n} z_i z_{i+1},

where z_{n+1} = z_1. The test based on this criterion, which was proposed by
Wald and Wolfowitz [8], is therefore most powerful against the above class of
alternatives.
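
The criterion of problem 3 is referred to its distribution over all n! orderings of the observations. A small Python sketch, with hypothetical function names and feasible only for small n, is:

```python
from itertools import permutations
from fractions import Fraction

def circular_lag_product(z):
    """Sum of z_i * z_{i+1} with the convention z_{n+1} = z_1."""
    n = len(z)
    return sum(z[i] * z[(i + 1) % n] for i in range(n))

def serial_correlation_pvalue(z):
    """One-sided exact p-value over all n! orderings of the observations."""
    observed = circular_lag_product(z)
    values = [circular_lag_product(p) for p in permutations(z)]
    return Fraction(sum(v >= observed for v in values), len(values))

p = serial_correlation_pvalue([1, 2, 4, 3])
```

For four distinct values there are only three distinct circular arrangements, each arising from 8 orderings; the observed arrangement 1, 2, 4, 3 gives the largest criterion value, 25, so the p-value is 8/24.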

PROBLEM 4. As a last problem, we shall test the hypothesis H that the joint
density of Z_1, ..., Z_n is symmetric in its n arguments and symmetric about
each coordinate hyperplane, that is, invariant under the transformation
z'_i = −z_i, z'_j = z_j for all j ≠ i, for i = 1, ..., n. This will be tested against
the alternatives that the Z's are independently, identically distributed according
to a normal distribution with non-zero mean. If we restrict this mean to positive
values, we get

(3.12)   g(z) = \Bigl(\frac{1}{\sqrt{2\pi}\,\sigma}\Bigr)^{n}
             \exp\Bigl\{ -\frac{1}{2\sigma^2} \sum (z_i - \xi)^2 \Bigr\} \sim \sum z_i.

If on the other hand both positive and negative values are allowed for the mean,
the most stringent test is based on the statistic |Σ z_i|.

This test may be appropriate for some situations in which it is customary to
use the sign test.


4. Binomial and other non-normal alternatives. In the present section
we shall be concerned mainly with generalisations of problems 1b) and 1c) of
section 3. As described there, the hypotheses referred to the class of all proba- 
bility densities in the usual sense. However, as was pointed out at the end of 
section 2, the same tests may be considered as referring to much wider hypothe- 
ses. If they are interpreted in this way, it is possible to greatly widen the class 
of alternatives without destroying the optimum properties of the tests. 

Let Z = (X_1, ..., X_n, Y_1, ..., Y_n) and denote by Π the partition under
which two points z and z' are equivalent if they are obtainable from each other
by a permutation of coordinates. Let H_0 be the hypothesis of invariance under
Π. This is a generalization of the hypothesis of complete symmetry referring
to a class of probability densities. Consider as alternative the class of
distributions defined by

(4.1)    P\{Z \in A\} = \int_A C \exp\{\theta_1 \sum x_i + \theta_2 \sum y_i + \sum r(x_i) + \sum r(y_i)\}\, d\mu(z),

where the θ's are any real numbers, where μ is the 2nth power of any
one-dimensional measure ν (and therefore invariant under Π), and where r is any
ν-measurable function, subject only to the condition that the integral (4.1) converges
when taken over the whole space.

We first consider the one-sided case θ_2 > θ_1. Using theorem 2 for a particular
θ_1, θ_2, r and μ, we then have

(4.2)    g(z) = C \exp\{\theta_1 \sum x_i + \theta_2 \sum y_i + \sum r(x_i) + \sum r(y_i)\}
             \sim \theta_1 \sum x_i + \theta_2 \sum y_i
             \sim \theta_1 \sum x_i + \theta_2 \sum y_i - \tfrac12(\theta_1 + \theta_2) \sum [x_i + y_i]
             = \tfrac12(\theta_1 - \theta_2) [\sum x_i - \sum y_i] \sim \sum y_i - \sum x_i.

Since this test does not depend on θ_1, θ_2, r or μ, it is uniformly most powerful
against the one-sided class of alternatives θ_2 > θ_1.

Dropping the restriction θ_2 > θ_1, we apply theorem 4 with Ω' the set consisting
of the two points θ_1, θ_2, r, μ and θ_2, θ_1, r, μ. At these two points the envelope
power function obviously takes on the same value. If for λ we select the
distribution which assigns equal probabilities to both points, then

(4.3)    g(z) \sim \exp\{\theta_1 \sum x_i + \theta_2 \sum y_i\} + \exp\{\theta_2 \sum x_i + \theta_1 \sum y_i\}
             \sim \exp\{\tfrac12(\theta_2 - \theta_1)[\sum x_i - \sum y_i]\} + \exp\{\tfrac12(\theta_1 - \theta_2)[\sum x_i - \sum y_i]\}
             \sim |\sum x_i - \sum y_i| \sim |\bar y - \bar x|.

The power of this test clearly is the same against both points of Ω'. Since
furthermore the test does not depend on the θ's, r, or μ, it is most stringent
against H_1.

A univariate distribution such that

(4.4)    P\{X \in A\} = \int_A C \exp\{\theta x + r(x)\}\, d\nu(x)

has been called Laplacian by Tweedie [9], who has studied these distributions
in a different connection. Among others, the normal and χ², the binomial and
Poisson distributions are Laplacian. To obtain, for example, the distribution
of a characteristic variable, take for ν the measure ν* which assigns to a set D
the values 0, 1 or 2 according as D contains none, one or both of the points
x = 0 and x = 1, and take as density the function

(4.5)    p^x (1 - p)^{1-x} = (1 - p)\, e^{x \log(p/(1-p))}.
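
Two further members of the family (4.4) can be written out in the same way. The parametrizations below are standard facts supplied here as a check; they are not taken from the text, and the choice of ν in each case is stated explicitly.

```latex
\begin{align*}
\text{Poisson}(\lambda),\ \nu = \text{counting measure on } \{0,1,2,\dots\}:\quad
  & e^{-\lambda}\frac{\lambda^{x}}{x!}
    = e^{-\lambda}\exp\{x\log\lambda - \log x!\},
  && \theta = \log\lambda,\quad r(x) = -\log x!; \\
\text{Normal}(\xi,\sigma^{2}),\ \nu = \text{Lebesgue measure}:\quad
  & \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-(x-\xi)^{2}/2\sigma^{2}}
    = C\exp\Bigl\{\frac{\xi}{\sigma^{2}}\,x - \frac{x^{2}}{2\sigma^{2}}\Bigr\},
  && \theta = \frac{\xi}{\sigma^{2}},\quad r(x) = -\frac{x^{2}}{2\sigma^{2}}.
\end{align*}
```

In the normal case C = (2πσ²)^{-1/2} exp{−ξ²/2σ²}; since r involves σ, two normal populations share the same r in (4.1) exactly when they have a common variance.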


For comparison with tests which have been considered in the literature, one
can specialize the problem just considered, so that the hypothesis H_0' and the
class of alternatives H_1' consist only of those members of H_0 and H_1 which are
generalized densities with respect to a fixed measure μ. One can specialize even
further and take as alternative any subset of H_1' provided with any point θ_1, θ_2, r,
it also contains the point θ_2, θ_1, r. The test clearly will not change with these
specializations, and the test based on (4.3) will therefore possess the same
optimum properties with respect to these special problems as with respect to
the problem for which it was originally derived.

If in particular one selects for ν the measure ν* mentioned above, one obtains
the problem for which R. A. Fisher proposed the test based on (4.3). It follows
that this test, Fisher's exact test, is most stringent in connection with the
following problem: The random variables X_1, ..., X_n, Y_1, ..., Y_n are
characteristic variables, that is, they can take on only the values 0 and 1. If we
let

(4.6)    P\{X_1 = x_1, \dots, Y_n = y_n\} = P(x_1, \dots, y_n),

the hypothesis states that the function P is invariant under all permutations of
its arguments. An equivalent formulation is that the probability (4.6) depends
only on Σ x_i + Σ y_i, the total number of "successes". Fisher's exact test is most
stringent against the alternative that the X's and Y's are samples from two distinct
populations of characteristic variables, that is, two populations corresponding to
distinct probabilities of success.
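
For characteristic variables the permutation distribution of the criterion |Σ x_i − Σ y_i| is available in closed form: given the total number of successes, Σ x_i is hypergeometric. The Python sketch below uses this fact to compute an exact two-sided p-value for two samples of equal size n. The function name is hypothetical, and a p-value is reported in place of the randomized rule (2.1).

```python
from fractions import Fraction
from math import comb

def exact_two_sided_pvalue(x, y):
    """Exact p-value for |sum(y) - sum(x)| with 0/1 data and equal sample sizes.

    Given s = sum(x) + sum(y), every arrangement of the s successes among
    the 2n positions is equally likely under the hypothesis, so sum(x) = k
    has probability C(n, k) * C(n, s - k) / C(2n, s)."""
    n = len(x)
    if len(y) != n:
        raise ValueError("this sketch assumes samples of equal size")
    s = sum(x) + sum(y)
    observed = abs(sum(y) - sum(x))
    tail = 0
    for k in range(max(0, s - n), min(n, s) + 1):
        if abs((s - k) - k) >= observed:      # |sum(y) - sum(x)| when sum(x) = k
            tail += comb(n, k) * comb(n, s - k)
    return Fraction(tail, comb(2 * n, s))

p = exact_two_sided_pvalue([1, 1, 1, 0], [0, 0, 0, 0])
```

With three successes all in the first sample, the two extreme allocations carry weight 4 + 4 out of C(8, 3) = 56, which gives the p-value 1/7.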

Problem 1c) of section 3 can be extended quite analogously. Put again
Z = (X_1, ..., X_n, Y_1, ..., Y_n), and denote by Π the partition under which
two points z and z' are equivalent provided they can be obtained from each other
by a permutation of coordinates in which only the coordinates within pairs
(X_i, Y_i) are interchanged. Consider the hypothesis of invariance under Π
with reference to the class of all distributions and as alternative the class of
distributions given by

(4.7)    P\{Z \in A\} = \int_A C \exp\Bigl\{ \sum_{i=1}^{n} [\theta_1 x_i + \theta_2 y_i + r(x_i, y_i)] \Bigr\}\, d\mu(z).

The θ's here are any real numbers, μ is the 2nth power of any one-dimensional
measure ν, and r is any ν-measurable function such that (a) the integral (4.7)
converges when A is the whole space, and such that (b) r(x, y) = r(y, x).

Clearly in the one-sided case θ_2 > θ_1 we will again find g(z) ~ Σ y_i − Σ x_i ~ ȳ − x̄,
so that the associated test is uniformly most powerful against this one-sided
class of alternatives, while the test based on |ȳ − x̄| is again most stringent
against the full alternative H_1.

The class of distributions (4.7) contains the distributions (4.1) as a special
case. If (X_i, Y_i), i = 1, ..., n, is a sample from a bivariate normal distribution
with σ_X = σ_Y, we get another case of (4.7).

As a last somewhat more special problem we mention a discrete analogue of
problem 4 of section 3. Let Z = (Z_1, ..., Z_n) and consider the class of
generalized densities given by

(4.8)    P\{Z \in A\} = \int_A P(z_1, \dots, z_n)\, d\mu(z),

where μ is the nth power of ν*. Let H_0 be the hypothesis that P is invariant
under permutations of the coordinates and under the group generated by the
transformations z'_i = 1 − z_i, z'_j = z_j, j ≠ i, for i = 1, ..., n. This is an
extension of the hypothesis that the probability of success in a binomial
distribution equals ½. The test of H_0 against the alternatives that Z_1, ..., Z_n
is a sample of a characteristic variable is based on Σ z_i or |Σ (z_i − ½)| according
as P{Z_i = 1} is restricted to be greater than ½ or is not so restricted. In the first
case the test is most powerful, in the second most stringent.


5. Hypotheses of invariance for independent variables. To the results
obtained so far, a different interpretation can be given, which throws some light on
certain related problems. Theorem 2 gave sufficient conditions for a test to
be most powerful against a simple alternative H_1 for the hypothesis H_0 of
invariance under a partition Π. However, if taken in conjunction with section
1, the theorem can be interpreted as giving sufficient conditions for a test to be
the most powerful test of structure S(ε) with respect to Π against H_1. That
is, the theorem is really independent of the hypothesis, and depends solely
on the alternative and on the class of tests admitted into competition, in
our case the class of all tests having structure S(ε) with respect to Π. The
same remark obviously also applies to most stringent tests.

Let us now consider a special class of partitions. Let Z stand for the m
groups of random variables (Z_{i1}, ..., Z_{is_i}) (i = 1, ..., m) and let Π denote the
partition under which two points z and z' are equivalent provided they can be
obtained from each other by a permutation of coordinates which however
permutes only the coordinates within the m groups. Let μ be the power of a
one-dimensional measure ν, and assume that the probability distribution of Z
is absolutely continuous with respect to μ and that the Z's are independently
distributed, so that

(5.1)    P\{Z \in A\} = \int_A \prod_{i,j} f_{ij}(z_{ij})\, d\nu(z_{ij}).

Under these assumptions consider the hypothesis H that f_{ij} is independent of j,
that is, that the Z's are identically distributed within each group. It easily can
be shown that not all admissible tests of H that have size ε have structure S(ε).
However a generalization of a result of Feller [10] and Scheffé [5] for the case
m = 1 and μ = Lebesgue measure states that the only tests which are of size ε
and similar for H are the tests of structure S(ε) with respect to Π [11]. It
follows that any test which is most powerful or most stringent for testing the
hypothesis H' of invariance under Π for the class of generalised densities with
respect to μ has the same property relative to the class of all tests which are
similar for testing H.
As an example, take problem 1b) of section 3. Here μ is Lebesgue measure,
m is 1, and we put

(5.2)    Z_j = \begin{cases} U_j & \text{for } j = 1, \dots, k, \\ V_{j-k} & \text{for } j = k+1, \dots, k+l = s. \end{cases}

It was shown in section 3 that the test based on |v̄ − ū|, Pitman's test, is most
stringent for testing the hypothesis that the joint density of the U's and V's is
symmetric in its k + l arguments against the alternative that the variables are
independently normally distributed with common variance and such that
E(U_i) = ξ, E(V_i) = η where ξ and η are any distinct real numbers. It follows
now that the same test is most stringent similar for testing against the same class
of alternatives the hypothesis that U_1, ..., U_k, V_1, ..., V_l are independently
distributed, all with the same probability density. This is the hypothesis for
which Pitman proposed his test, and the result just stated is a partial solution
of the problem recently raised by Wilks [12], to determine the class of alternatives
for which Pitman's test is satisfactory.

If we modify the example by taking for μ instead of Lebesgue measure the
(k + l)th power of the measure ν* of section 4, we are dealing with characteristic
variables U_1, ..., U_k, V_1, ..., V_l. We have shown earlier that if k = l
the test based on |v̄ − ū| is most stringent for testing the hypothesis of complete
permutability against the alternative that the U's and V's are samples from two
distinct populations of characteristic variables. If we add to this hypothesis
the assumption of independence of all variables, we obtain a parametric problem,
namely essentially the problem of testing equality of probability of success in
two binomial populations corresponding to the same number of trials. It now
follows that the test based on |v̄ − ū| is most stringent for this problem. As is
well known, it is also the uniformly most powerful unbiased similar test.

These two examples suffice to illustrate the type of result that can be obtained.
It should perhaps be mentioned that the equivalence discussed at the beginning
of this section can be utilized also in the opposite direction. The fact, for
example, that the test based on |v̄ − ū| is known to be uniformly most powerful
unbiased similar for testing equality of probability of success in two populations
of characteristic variables from which the U's and V's are samples, proves that
this test is uniformly most powerful unbiased for testing the hypothesis of
complete symmetry for the joint generalized density of the U's and V's.


6. Extension to infinite equivalence classes. The definition of a hypothesis 
of invariance given in section 1—in spite of the restriction to finite equivalence 
classes—was sufficiently general to cover the non-parametric problems that we 
wanted to study. It is possible however to extend the definition so as to allow 
infinite equivalence classes. In this concluding section we shall briefly outline a 
theory based on such a broader definition. This will enable us to point out a 
relationship between the approach of the present paper and the standard 
parametric theory. 

Let 𝒵 be a space of points z and 𝒞 an additive class of subsets of 𝒵. We
define a partition of 𝒵 into subsets {S_t} as follows: Let 𝒯 be some space, and
for each t ∈ 𝒯 let S_t be a measurable subset of 𝒵 (i.e. an element of 𝒞) such that
the S_t are mutually exclusive and exhaustive. Let 𝒞_0 be the class of all C_0 ∈ 𝒞
which can be expressed in the form

(6.1)    C_0 = \bigcup_{t \in D_0} S_t

and let 𝒟_0 be the class of all D_0 occurring in such relationships. For each
t ∈ 𝒯 let G_t be a specified probability measure over 𝒞_t, where 𝒞_t is the class of
A_t such that A_t ⊂ S_t, A_t ∈ 𝒞. Let Z be a random variable distributed over 𝒵
according to an unknown probability measure F. Let ψ(z) be that t ∈ 𝒯 for
which z ∈ S_t, and let T = ψ(Z). Let H_0 be the hypothesis that for each t ∈ 𝒯
the conditional distribution of Z given Z ∈ S_t is G_t, i.e. that there exists a
probability measure Q_0 over 𝒟_0 such that for all A ∈ 𝒞

(6.2)    F(A) = \int G_t(A \cap S_t)\, dQ_0(t).

It is seen that we have essentially the situation described in section 1, except
that there we assumed further that each S_t was finite and for all t, G_t assigned
equal probabilities to all points of S_t.

We say that a test φ of H_0 has structure S(ε) if the conditional expectation
E_t[φ(Z)] of φ(Z) given Z ∈ S_t satisfies

(6.3)    E_t[\varphi(Z)] = \int_{S_t} \varphi\, dG_t = \varepsilon \qquad \text{for all } t.

The lemmas and theorems stated below are straightforward generalizations
of those in section 1 so that no proof will be given.

Lemma 1'. Any test φ of structure S(ε) with respect to H_0 is similar and of
size ε for H_0.

Lemma 2'. If φ is any test of H_0 of size ≤ ε, there exists a test φ_1 of H_0 having
structure S(ε) and such that

(6.4)    \int \varphi_1\, dF \ge \int \varphi\, dF

for all probability measures F for which the conditional distribution of Z given
Z ∈ S_t is absolutely continuous with respect to G_t for all t.

Suppose next there is defined another partition of 𝒵 into sets {S_{1u}} by means
of a space 𝒰, and let 𝒞_1, 𝒟_1 and 𝒞_{1u} refer to this second partition. We shall
assume that for every t ∈ 𝒯, u ∈ 𝒰 either S_{1u} ⊂ S_t or S_{1u} ∩ S_t is empty. Let G_{1u}
be a specified probability measure over 𝒞_{1u} and suppose that for each t ∈ 𝒯
there exists a probability measure Q_t such that for all A_t ∈ 𝒞_t

(6.5)    G_t(A_t) = \int G_{1u}(A_t \cap S_{1u})\, dQ_t(u).

If H_1 denotes the hypothesis that for each u ∈ 𝒰 the conditional distribution of
Z given Z ∈ S_{1u} is G_{1u}, we can state

THEOREM 1'. For testing H_0 against H_1 at level of significance ε, the totality of
tests φ which have structure S(ε) and for which z, z' ∈ S_{1u} implies φ(z) = φ(z') form
an essentially complete class of admissible tests.

Let F_1 be a distribution not in H_0, and for each t ∈ 𝒯 let G_{1t} be the conditional
distribution of Z given Z ∈ S_t. We suppose that for each t ∈ 𝒯, G_{1t} is chosen
to be a true probability measure, which is possible in most cases of practical
interest (see Doob [13] for a discussion of this point). Then we have the
equivalent of theorem 2:

THEOREM 2'. Let

(6.6)    G_{1t}(A_t) = \int_{A_t} g_t\, dG_t + G_{1t}(A_t \cap H_t)

for all A_t ⊂ S_t, where in accordance with the Radon-Nikodym Theorem [14], g_t
is a non-negative function² integrable over S_t, and H_t ⊂ S_t has G_t measure 0 and
does not depend on A_t. For testing H_0 against H_1, a most powerful test of size ε
is given by φ(z) = φ_t(z) for z ∈ S_t where

(6.7)    \varphi_t(z) = \begin{cases}
             1   & \text{if } z \in H_t, \\
             1   & \text{if } g_t(z) > c_t, \\
             a_t & \text{if } g_t(z) = c_t, \\
             0   & \text{if } g_t(z) < c_t,
         \end{cases}

where c_t and a_t are so chosen that φ has structure S(ε).

Theorems 3 and 4 require no modification. 

As in the case of finite equivalence classes the results just outlined can be 
interpreted differently. Again the theorems are really independent of the 
hypotheses, but depend only on the alternatives and on the class of tests admitted 
into competition. This class of tests $\varphi$ is in the present case defined by condition
(6.4), that the conditional expectation of $\varphi$ given $Z \in S_t$ equals $\epsilon$. But this is
just the condition which in the standard approach to the problem of testing a
composite parametric hypothesis for which $T$ is a sufficient statistic, by means
of similar regions is frequently found to be the necessary and sufficient condition
for $\varphi$ to be similar. (See for example [15]). For these cases therefore the
hypotheses of the present section represent non-parametric analogues to which 
the same tests apply with the same optimum properties but without the a priori 
restriction to similar regions. 

As a simple illustration of this remark, let $Z = (Z_1, \cdots, Z_n)$, and let $T = \sum_{i=1}^n Z_i^2$. For the conditional distribution of $Z$ given $T = t$ take the uniform distribution over the sphere $T = t$, and for $\mu$ take Lebesgue measure. Then the hypothesis $H$ states merely that the joint probability density of the $Z$'s is a function only of $\sum_{i=1}^n Z_i^2$. If we add to this the assumption of independence of the $Z$'s, we obtain the new hypothesis $H'$ that the $Z$'s are a sample from a normal distribution with zero mean. The tests $\varphi$ for which the conditional expectation over each sphere is $\epsilon$ constitute the only admissible tests of $H$ and the only admissible similar tests of $H'$. If as alternatives we consider that the $Z$'s are a sample from a normal distribution with mean $\xi > 0$, the test

(6.8)    $\dfrac{\bar{z}}{\sqrt{\sum (z_i - \bar{z})^2}} \ge C$

is uniformly most powerful for $H$ and uniformly most powerful similar for $H'$. If we do not restrict $\xi$ to positive values, the test

(6.9)    $\dfrac{|\bar{z}|}{\sqrt{\sum (z_i - \bar{z})^2}} \ge C'$,

Student's test, is uniformly most powerful unbiased and most stringent for testing $H$, uniformly most powerful unbiased similar and most stringent similar for testing $H'$.
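The statistic in (6.8) and (6.9) differs from the usual Student ratio only by the constant factor $\sqrt{n(n-1)}$, and it is unchanged when every observation is multiplied by the same positive constant. The following is a minimal numerical sketch of these two facts; the function names and the sample values are illustrative assumptions, not part of the paper.

```python
import numpy as np

def sphere_statistic(z):
    """Statistic of (6.8): sample mean over the root sum of squared deviations."""
    z = np.asarray(z, dtype=float)
    zbar = z.mean()
    return zbar / np.sqrt(np.sum((z - zbar) ** 2))

def student_t(z):
    """Usual one-sample Student ratio for testing mean zero."""
    z = np.asarray(z, dtype=float)
    n = z.size
    return z.mean() / (z.std(ddof=1) / np.sqrt(n))

# Hypothetical sample, chosen only for illustration.
z = np.array([0.3, -1.2, 2.1, 0.8, 1.5])
n = z.size

# The two statistics are proportional: t = sqrt(n(n-1)) * sphere_statistic.
ratio = student_t(z) / sphere_statistic(z)
```

Because the two statistics are proportional, a critical value $C$ for (6.8) can be read from a table of Student's distribution after dividing by $\sqrt{n(n-1)}$.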





REFERENCES

[1] R. A. FISHER, Design of Experiments, Oliver and Boyd, Edinburgh, 1935.
[2] J. NEYMAN, K. IWASZKIEWICZ AND ST. KOŁODZIEJCZYK, "Statistical problems in agricultural experimentation," Roy. Stat. Soc. Jour. Suppl., Vol. 2 (1935), p. 107.
[3] E. J. G. PITMAN, "Significance tests which may be applied to samples from any population," Roy. Stat. Soc. Jour. Suppl., Vol. 4 (1937), p. 119; II. The correlation coefficient test, Roy. Stat. Soc. Jour. Suppl., Vol. 4 (1937), p. 225; III. The analysis of variance test, Biometrika, Vol. 29 (1938), p. 322.
[4] E. L. LEHMANN AND C. STEIN, "Most powerful tests of composite hypotheses. I. Normal distributions," Annals of Math. Stat., Vol. 19 (1948).
[5] H. SCHEFFÉ, "On a measure problem arising in the theory of non-parametric tests," Annals of Math. Stat., Vol. 14 (1943), p. 227.
[6] A. WALD, "An essentially complete class of admissible decision functions," Annals of Math. Stat., Vol. 18 (1947), p. 549.
[7] G. HUNT AND C. STEIN, "Most stringent tests of statistical hypotheses," unpublished.
[8] A. WALD AND J. WOLFOWITZ, "An exact test for randomness in the non-parametric case, based on serial correlation," Annals of Math. Stat., Vol. 14 (1943), p. 378.
[9] M. C. K. TWEEDIE, "Functions of a statistical variate with given means, with special reference to Laplacian distributions," Cam. Phil. Soc. Proc., Vol. 43 (1947), p. 41.
[10] W. FELLER, "Note on regions similar to the sample space," Stat. Res. Memoirs, Vol. 2 (1938), p. 117.
[11] E. L. LEHMANN AND H. SCHEFFÉ, "Completeness, similar regions and unbiased estimation," unpublished.
[12] S. S. WILKS, "Order Statistics," Am. Math. Soc. Bull., Vol. 54 (1948), p. 6.
[13] J. L. DOOB, "Asymptotic properties of Markoff transition probabilities," Trans. Amer. Math. Soc., Vol. 63 (1948), footnote p. 399.
[14] S. SAKS, Theory of the Integral, Stechert, 1937.
[15] J. NEYMAN AND E. S. PEARSON, "On the problem of the most efficient tests of statistical hypotheses," Roy. Soc. London Phil. Trans., Ser. A, Vol. 231 (1933), p. 289.
[16] A. WALD, On the Principles of Statistical Inference, Notre Dame Mathematical Lectures, Number 1.








ESTIMATION OF THE PARAMETERS OF A SINGLE EQUATION IN A
COMPLETE SYSTEM OF STOCHASTIC EQUATIONS¹ ²


By T. W. ANDERSON³ AND HERMAN RUBIN⁴


Columbia University and Institute for Advanced Study 


1. Summary. A method is given for estimating the coefficients of a single 
equation in a complete system of linear stochastic equations (see expression 
(2.1)), provided that a number of the coefficients of the selected equation are 
known to be zero. Under the assumption of the knowledge of all variables in 
the system and the assumption that the disturbances in the equations of the 
system are normally distributed, point estimates are derived from the regressions 
of the jointly dependent variables on the predetermined variables (Theorem 1). 
The vector of the estimates of the coefficients of the jointly dependent variables 
is the characteristic vector of a matrix involving the regression coefficients 
and the estimate of the covariance matrix of the residuals from the regression 
functions. The vector corresponding to the smallest characteristic root is 
taken. An efficient method of computing these estimates is given in section 7. 
The asymptotic theory of these estimates is given in a following paper [2]. 

When the predetermined variables can be considered as fixed, confidence 
regions for the coefficients can be obtained on the basis of small sample theory 
(Theorem 3). 

A statistical test for the hypothesis of over-identification of the single equation 
can be based on the characteristic root associated with the vector of point 
estimates (Theorem 2) or on the expression for the small sample confidence 
region (Theorem 4). This hypothesis is equivalent to the hypothesis that the 
coefficients assumed to be zero actually are zero. The asymptotic distribution 
of the criterion is shown in a following paper [2] to be that of $\chi^2$.


2. A complete system of linear difference equations. In many fields of study 
such as economics, biology, and meteorology the occurrence of values of the 
observed quantities can be described in terms of a probability model which, as a 
first approximation, is a set of stochastic equations. Consider a (row) vector $y_t$
of quantities which are observed at time $t$. Suppose that these quantities are
jointly dependent on a vector $z_t$ of quantities "predetermined" at time $t$ (i.e.,
known without error at time $t$). Some of the coordinates of $z_t$ may be coordinates


1 This paper will be included in Cowles Commission Papers, New Series, No. 36. 

2 The results in this paper were presented at meetings of the Institute of Mathematical
Statistics in Washington, D. C., April 12, 1946 (Washington Chapter) and in Ithaca, N. Y.,
August 23, 1946.

3 Fellow of the John Simon Guggenheim Memorial Foundation; Research Consultant 
of the Cowles Commission for Research in Economics. 

4 National Research Fellow; Research Consultant of the Cowles Commission for Re- 
search in Economics. 




of $y_{t-1}$, $y_{t-2}$, etc.; other coordinates of $z_t$ are quantities which are assumed given
constants. The vectors $y_t$ $(t = 1, 2, \cdots, T)$ are called endogenous. The
part of the set $z_t$ which does not consist of lagged endogenous variables is called
exogenous; these are treated as "fixed variates." For convenience we shall
think of $t$ as indicating a point of time, although it may in many cases indicate
the ordering of a sample in another dimension, or, indeed, the $t$ may indicate
simply a numbering of the observations (if $z_t$ is entirely exogenous). In a
dynamic economic model the endogenous variables are economic quantities 
such as amount of investment, interest rate, amount of consumption, etc. The 
exogenous variables are those quantities which are considered to be determined 
primarily outside the economic system, such as amount of rainfall, amount of 
government expenditures, time, etc. 

A simple probability model may be set up on the assumption that these quantities approximately satisfy certain linear equations. Specifically the model is

(2.1)    $B_{yy} y_t' + \Gamma_{yz} z_t' = \epsilon_t'$,

where $\epsilon_t$ is a (row) vector having a probability distribution with expected value zero and $B_{yy}$ and $\Gamma_{yz}$ are matrices, the former being non-singular. Primes (') indicate transposition of vectors and matrices. If there are $G$ jointly dependent variables, there are $G$ component equations in (2.1); that is, there are as many equations as there are variables depending on the system. The fact that $y_t$ and $z_t$ do not satisfy linear equations exactly is indicated by setting the linear forms not equal to zero, but equal to random elements, called disturbances.
We will call the component equations of (2.1) structural equations, for they 
express the structure of the system. For example, one equation involving the 
amount of goods consumed, the prices of these goods, the size of the national 
income, etc., might describe the behaviour of the consumers. Another equation 
involving interest rate might relate to the behaviour of investors. 

It has been shown [7], [11], that in general one cannot use ordinary regression
methods to estimate the matrices $B_{yy}$ and $\Gamma_{yz}$ and the parameters of an assumed
distribution of the disturbances. Mann and Wald [9], for a special class of
systems, and Koopmans, Rubin, and Leipnik [11], in a more general case, have
obtained maximum likelihood estimates of all of the parameters for the case of
the $\epsilon_t$ having a normal multivariate distribution.

Since $B_{yy}$ is non-singular, we can rewrite (2.1) in a different form, called the reduced form,

(2.2)    $y_t' = -B_{yy}^{-1}\Gamma_{yz} z_t' + B_{yy}^{-1}\epsilon_t'$,

or as

(2.3)    $y_t' = \Pi_{yz} z_t' + \eta_t'$,

where

(2.4)    $\Pi_{yz} = -B_{yy}^{-1}\Gamma_{yz}$,

(2.5)    $\eta_t' = B_{yy}^{-1}\epsilon_t'$.

If $\epsilon_t$ has a normal distribution, so does $\eta_t$. For a given $t$ then, we can consider
the model as specifying a distribution of $y_t$ with conditional expected value $z_t\Pi_{yz}'$.
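As a small numerical sketch of the passage from the structural form (2.1) to the reduced form (2.3), one may generate a system with arbitrarily chosen coefficient matrices and verify (2.4) and (2.5) directly. All numerical values below are hypothetical and serve only as an illustration; observations are stored as rows, in keeping with the row-vector convention of the text.

```python
import numpy as np

rng = np.random.default_rng(0)
G, K, T = 2, 3, 5

# Hypothetical structural coefficients: B_yy (non-singular) and Gamma_yz.
B = np.array([[1.0, -0.5],
              [0.3,  1.0]])
Gamma = np.array([[0.7, 0.0, -1.0],
                  [0.0, 0.4,  0.2]])

Z = rng.normal(size=(T, K))      # rows are the predetermined vectors z_t
Eps = rng.normal(size=(T, G))    # rows are the disturbances eps_t

Pi = -np.linalg.solve(B, Gamma)          # (2.4)
Eta = np.linalg.solve(B, Eps.T).T        # (2.5), rows are eta_t
Y = Z @ Pi.T + Eta                       # (2.3), rows are y_t

# The generated y_t satisfy the structural equations (2.1).
residual = B @ Y.T + Gamma @ Z.T - Eps.T

# Multiplying (2.1) on the left by a non-singular matrix leaves Pi unchanged.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
Pi_alt = -np.linalg.solve(A @ B, A @ Gamma)
```

The last two lines illustrate why the structural coefficients cannot be recovered from the reduced form without a priori restrictions: the transformed system has different coefficients but the same $\Pi_{yz}$.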

It is clear that we can multiply (2.1) on the left by any non-singular matrix
and obtain a system of equations which defines the same distribution of $y_t$.
On the other hand, it has been shown that the only transformations of $(B_{yy}\ \Gamma_{yz})$
which preserve the linearity of the system of equations are multiplications on the
left by non-singular matrices. If there are a priori restrictions on $(B_{yy}\ \Gamma_{yz})$,
the set of matrices which result in new coefficient matrices satisfying these
restrictions is correspondingly decreased. If the set of admissible matrix
multipliers includes only diagonal matrices the system of structural equations
is said to be identified. In this case only multiplication of all coefficients of an equation by a
given constant is permitted.

Knowledge of the distribution of $y_t$ given $z_t$ is obviously equivalent to knowledge of $\Pi_{yz}$ in (2.3) and the distribution of $\eta_t$. When the system is identified, the matrix $B_{yy}$ and

(2.6)    $\Gamma_{yz} = -B_{yy}\Pi_{yz}$

are determined uniquely except for multiplication on the left by a diagonal matrix. Thus identification of a system is equivalent to the possibility of inferring the structural equations from knowledge of the distribution. The estimation of all coefficients of $B_{yy}$ and $\Gamma_{yz}$ has been considered in [11].


3. A single identified equation of a complete system. In many studies the investigator may be interested only in a specific equation of the system, say,

(3.1)    $\beta_y y_t' + \gamma_z z_t' = \zeta_t$,

where $\zeta_t$ is a scalar disturbance. The investigator may not be interested in the entire system (2.1) of which (3.1) is one component. Since a considerable amount of computation is necessary to estimate all parameters of a complete system, there arises the problem of estimating only the coefficients of a single equation. It is desirable to do this with the least restrictive assumptions possible about the part of the system which is not the selected structural equation. In order to treat the selected equation at all, we require that it be identified; that is, that there are certain restrictions on $(\beta_y, \gamma_z)$ such that no linear combination of rows of $(B_{yy}\ \Gamma_{yz})$ satisfies these restrictions other than a constant times $(\beta_y, \gamma_z)$. It is not necessary to assume that every component equation is identified; that is, that the entire system is identified.

We shall suppose that the restrictions imposed are that certain coefficients are zero. We can arrange the components of the vectors so that the restrictions are

(3.2)    $(\beta_y, \gamma_z) = (\beta, 0, \gamma, 0)$,

where

(3.3)    $\beta = (\beta^1, \cdots, \beta^H)$

has $H$ coefficients not assumed to be zero and

(3.4)    $\gamma = (\gamma^1, \cdots, \gamma^F)$

has $F$ coefficients not assumed to be zero.

It will be convenient to divide the $G$ components of $y_t$ into two groups (in number $H$ and $G - H$, respectively), and the $K$ components of $z_t$ into two groups (in number $F$ and $D$ respectively) according to whether or not the components enter into (3.1) with coefficients not assumed to be zero. Let

(3.5)    $y_t = (x_t, r_t)$,
(3.6)    $z_t = (u_t, v_t)$,

where

(3.7)    $x_t = (x_{t1}, \cdots, x_{tH})$,
(3.8)    $r_t = (r_{t1}, \cdots, r_{t,G-H})$,
(3.9)    $u_t = (u_{t1}, \cdots, u_{tF})$,
(3.10)   $v_t = (v_{t1}, \cdots, v_{tD})$.

Then the selected equation is

(3.11)   $\beta x_t' + \gamma u_t' = \zeta_t$.


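Now let us see how the identification is accomplished. Partitioning $\Pi_{yz}$ into $H$ and $G - H$ rows and $F$ and $D$ columns as

$\Pi_{yz} = \begin{pmatrix} \Pi_{xu} & \Pi_{xv} \\ \Pi_{ru} & \Pi_{rv} \end{pmatrix}$,

we can write the reduced form (2.3) as

(3.12)   $x_t' = \Pi_{xu} u_t' + \Pi_{xv} v_t' + \delta_t'$,
(3.13)   $r_t' = \Pi_{ru} u_t' + \Pi_{rv} v_t' + \xi_t'$,

where

$\eta_t = (\delta_t, \xi_t)$.

Multiplying the above equations on the left by $(\beta, 0)$ we obtain

(3.14)   $\beta x_t' = \beta\Pi_{xu} u_t' + \beta\Pi_{xv} v_t' + \beta\delta_t'$.

Since this must be identical to (3.11) we must have

(3.15)   $\gamma = -\beta\Pi_{xu}$,
(3.16)   $0 = -\beta\Pi_{xv}$.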


The matrices $\Pi_{xu}$ and $\Pi_{xv}$ are defined by the distribution of $x_t$ given $u_t$ and $v_t$ (for at least $K = D + F$ linearly independent values of $(u_t, v_t)$). The equation (3.11) is identified if and only if the solution of (3.15) and (3.16) for $\beta$ and $\gamma$ is unique except for a constant of proportionality. This depends on the rank of $\Pi_{xv}$ being $H - 1$. Thus a necessary and sufficient condition that (3.11) is identified is that the rank of $\Pi_{xv}$, the matrix of regression coefficients of $x_t$ on $v_t$, be $H - 1$. In particular this implies that the number of coordinates of $v_t$ (the number of zero coefficients in $\gamma_z$) be at least $H - 1$. It can easily be shown that this condition is equivalent to requiring that the rank of the matrix obtained by selecting the $G - H$ columns of $B_{yy}$ and the $D$ columns of $\Gamma_{yz}$ corresponding to the coefficients assumed zero in the selected equation is $G - 1$. This is the condition given by Koopmans and Rubin [11]. Other homogeneous linear restrictions can be put in this form.
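The rank condition can be checked numerically when $\Pi_{xv}$ is known. The sketch below, a hypothetical illustration rather than a procedure given in the paper, uses the singular value decomposition to test whether the rank is $H - 1$ and, if so, returns the $\beta$ satisfying (3.16), normalized so that $\beta^1 = 1$ (which assumes $\beta^1 \neq 0$). The function name, the tolerance, and the example matrices are assumptions.

```python
import numpy as np

def identified_beta(Pi_xv, tol=1e-10):
    """Return beta with beta @ Pi_xv = 0 and beta[0] = 1 if rank(Pi_xv) = H - 1.

    Returns None when the rank differs from H - 1 (equation not identified).
    Assumes the first coordinate of beta is non-zero.
    """
    H = Pi_xv.shape[0]
    U, s, _ = np.linalg.svd(Pi_xv)            # U is H x H
    rank = int(np.sum(s > tol * max(1.0, s[0])))
    if rank != H - 1:
        return None
    beta = U[:, -1]                           # left null vector of Pi_xv
    return beta / beta[0]

# Hypothetical example with H = 3, D = 4.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])
M = rng.normal(size=(3, 4))
# Remove from M its component along beta_true, so that beta_true @ Pi_xv = 0.
Pi_xv = M - np.outer(beta_true, beta_true @ M) / (beta_true @ beta_true)

beta_found = identified_beta(Pi_xv)   # proportional to beta_true
not_identified = identified_beta(M)   # M has rank H, so no beta exists
```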

If the vector $\epsilon_t$ is normally distributed with mean zero the vector $\eta_t$ is normally distributed with mean zero. Let the covariance matrix of $\delta_t$ be $\Omega_{xx}$. Then the variance of $\zeta_t = \beta\delta_t'$ is

(3.17)   $\sigma^2 = \beta\Omega_{xx}\beta'$.

The constant of proportionality in $\beta$ may be determined by setting the variance of $\zeta_t$, $\sigma^2$, equal to 1; another normalization is

(3.18)   $\beta^i = 1$,

where $\beta^i$ is the $i$th coordinate of $\beta$. In general the normalization can be written as

(3.19)   $\beta\Phi_{xx}\beta' = 1$,

where $\Phi_{xx}$ can be either a known constant matrix or a known function of unknown parameters.

As an estimation procedure for $\beta$ and $\gamma$ when $D = H - 1$, M. A. Girshick suggested in an unpublished note that one solve equations (3.15) and (3.16) with $(\Pi_{xu}, \Pi_{xv})$ replaced by $(P_{xu}, P_{xv})$, the sample regression of $x$ on $u$ and $v$. By these means Girshick found confidence regions (see section 8) for the parameters of a two equation system. A similar idea lies behind a method of O. Reiersøl [10].

The present paper develops a method for handling the case of $D \ge H$. In this case the rank of $P_{xv}$ is usually $H$, thus giving no admissible estimate of $\beta$. The proposed method follows the approach used in discriminant problems.

In a second paper [2] the present authors shall give asymptotic properties of 
these estimates that give a certain justification for the use of them. Under 
very general assumptions concerning the $v_t$ and the $\epsilon_t$ we prove that these
estimates are consistent. These hypotheses permit the investigator to neglect 
some predetermined variables absent from his particular equation. Alternative 
assumptions include the case of the other G — 1 equations being non-linear. 
Finally, it is shown that the estimates are asymptotically normally distributed. 
For this result it is not necessary to assume that the disturbances are normally 
distributed, or even that they have identical distributions. 


4. A description of the estimation procedure. In a sense the dependence of the endogenous variables $x_t$ on the predetermined variables $u_t$ and $v_t$ is given by the matrix $(\Pi_{xu}\ \Pi_{xv})$ of regression coefficients of $x_t$ on $u_t$ and $v_t$. The interdependence of the coordinates of $x_t$ indicated by the selected equation nullifies the dependence on $v_t$; that is,

(4.1)    $\beta\Pi_{xv} = 0$.


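Suppose we wish to estimate $\beta$ and $\gamma$ from a sample of $T$ observations: $(x_1, z_1), (x_2, z_2), \cdots, (x_T, z_T)$. The information we need can be summarized in the second order moment matrices

(4.2)    $M_{xx} = \dfrac{1}{T}\sum_{t=1}^T x_t' x_t$,

(4.3)    $M_{xz} = (M_{xu}\ M_{xv}) = \dfrac{1}{T}\left(\sum_{t=1}^T x_t' u_t \quad \sum_{t=1}^T x_t' v_t\right)$,

(4.4)    $M_{zz} = \begin{pmatrix} M_{uu} & M_{uv} \\ M_{vu} & M_{vv} \end{pmatrix} = \dfrac{1}{T}\begin{pmatrix} \sum_{t=1}^T u_t' u_t & \sum_{t=1}^T u_t' v_t \\ \sum_{t=1}^T v_t' u_t & \sum_{t=1}^T v_t' v_t \end{pmatrix}$.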

Since one coordinate of $u_t$ may be unity there is no advantage in taking these moments about the mean. We shall find it more convenient to use instead of $v_t$ the part of $v_t$ that is orthogonal to $u_t$; that is, we shall use

(4.5)    $s_t' = v_t' - M_{vu}M_{uu}^{-1}u_t'$.

The moments are then $M_{xx}$, $M_{xu}$, $M_{uu}$,

(4.6)    $M_{xs} = M_{xv} - M_{xu}M_{uu}^{-1}M_{uv}$,

and

(4.7)    $M_{ss} = M_{vv} - M_{vu}M_{uu}^{-1}M_{uv}$.

We can express the reduced form as

(4.8)    $x_t' = \Pi^*_{xu}u_t' + \Pi_{xs}s_t' + \delta_t'$,

where

(4.9)    $\Pi^*_{xu} = \Pi_{xu} + \Pi_{xv}M_{vu}M_{uu}^{-1}, \qquad \Pi_{xs} = \Pi_{xv}$.

An estimate of $\Pi_{xs}$ is the regression of $x$ on $s$,

(4.10)   $P_{xs} = M_{xs}M_{ss}^{-1}$.

To estimate $\beta$ we take the $\beta$ that makes $\beta P_{xs}$ smallest in the metric determined by the moment matrix of the residuals

(4.11)   $W_{xx} = M_{xx} - P_{xs}M_{ss}P_{xs}' - P_{xu}M_{uu}P_{xu}'$,

where

(4.12)   $P_{xu} = M_{xu}M_{uu}^{-1}$.

This is the natural generalization of least squares; the greatest weight is given to the component with least variance. This estimate is the vector satisfying

(4.13)   $(P_{xs}M_{ss}P_{xs}' - \nu W_{xx})b' = 0$

which is associated with the smallest root of

(4.14)   $|P_{xs}M_{ss}P_{xs}' - \nu W_{xx}| = 0$.

This is normalized and the estimate of $\gamma$ is $-bP_{xu}$.
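The procedure of this section can be sketched in a few lines of matrix code. The following is a minimal illustration under stated assumptions, not the computing scheme recommended in section 7: it forms the moment matrices (4.2) to (4.7), solves the determinantal equation (4.14) through a Cholesky factor of $W_{xx}$ (a modern convenience, assumed here), and uses the normalization $\beta^1 = 1$. The function name and the simulated data are hypothetical.

```python
import numpy as np

def liml_single_equation(X, U, V):
    """Rows of X, U, V are x_t, u_t, v_t. Returns (b, nu, gamma_hat)."""
    T = X.shape[0]

    def moment(A, B):
        return A.T @ B / T

    Muu_inv = np.linalg.inv(moment(U, U))
    S = V - U @ Muu_inv @ moment(U, V)              # rows are s_t, see (4.5)
    Mss = moment(S, S)                              # (4.7)
    P_xs = moment(X, S) @ np.linalg.inv(Mss)        # (4.10)
    P_xu = moment(X, U) @ Muu_inv                   # (4.12)
    A = P_xs @ Mss @ P_xs.T
    W = moment(X, X) - A - P_xu @ moment(U, U) @ P_xu.T   # (4.11)

    # Reduce (4.13) to a symmetric eigenproblem using W = L L'.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    roots, vectors = np.linalg.eigh(L_inv @ A @ L_inv.T)  # ascending roots
    nu = roots[0]                                   # smallest root of (4.14)
    b = L_inv.T @ vectors[:, 0]
    b = b / b[0]                                    # normalization (3.18), i = 1
    gamma_hat = -b @ P_xu
    return b, nu, gamma_hat

# Hypothetical data: H = 2, F = 2, D = 3, T = 200.
rng = np.random.default_rng(1)
T = 200
U = np.column_stack([np.ones(T), rng.normal(size=T)])
V = rng.normal(size=(T, 3))
X = rng.normal(size=(T, 2)) + V @ rng.normal(size=(3, 2))

b, nu, gamma_hat = liml_single_equation(X, U, V)
# With D = H - 1 the matrix P_xs M_ss P_xs' is singular and the root is zero.
_, nu_just, _ = liml_single_equation(X, U, V[:, :1])
```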

In section 5 we derive these estimates by the method of maximum likelihood 
under certain assumptions. Although it is assumed that the disturbances are 
normally distributed for this derivation, the estimates can be used in more 
general situations. This theory is in one sense a special case of the theory of 
estimating a matrix of means of a given dimensionality which is an extension 
of the discriminant function theory [5]. For an application of this method of 
estimation see [6]. 


5. Derivation of maximum likelihood estimates. We derive the estimates of $\beta$, $\gamma$, and $\sigma^2$ under the following assumptions:

ASSUMPTION A. The selected structural equation

(3.11)   $\beta x_t' + \gamma u_t' = \zeta_t$

is one equation of a complete linear system of $G$ stochastic equations. The equation is identified by the fact that if $H$ is the number of coordinates in $x_t$ there are at least $H - 1$ coordinates in $v_t$, the vector of predetermined variables not in (3.11) but in the system.

ASSUMPTION B. At time $t$ all of the coordinates of $z_t = (u_t, v_t)$ are given.

ASSUMPTION C. The coordinates of $z_t$ are given functions of exogenous variables and of coordinates of $y_{t-1}, y_{t-2}, \cdots$. If coordinates of $y_0, y_{-1}, \cdots$ are involved in $z_t$, they will be considered as given numbers. The moment matrix $M_{zz}$ is non-singular with probability one.

ASSUMPTION D. The disturbance vectors $\delta_t$ are distributed serially independently and normally with mean zero and covariance matrix $\Omega_{xx}$.

We shall consider normalizations (3.19) where $\Phi_{xx}$ may be a function of other parameters, but

(5.1)    $\partial\Phi_{xx}/\partial\beta = 0$.


We can state the results in a theorem:

THEOREM 1. Under assumptions A, B, C, and D the maximum likelihood estimate of $\beta$ is

(5.2)    $\hat\beta = b/\sqrt{b\hat\Phi_{xx}b'}$,

where $b$ is the solution of

(4.13)   $(P_{xs}M_{ss}P_{xs}' - \nu W_{xx})b' = 0$

corresponding to the smallest value of $\nu$ and $P_{xs}$ is defined by (4.10), $M_{ss}$ by (4.7), and $W_{xx}$ by (4.11). An estimate of $\gamma$ based on the maximum likelihood estimate $\hat\beta$ is given by

(5.3)    $\hat\gamma = -\hat\beta P_{xu}$,

where $P_{xu}$ is given by (4.12). The estimate of $\sigma^2$ is

(5.4)    $\hat\sigma^2 = (1 + \nu)/b\hat\Phi_{xx}b'$

if

(5.5)    $bW_{xx}b' = 1$.


We apply the method of maximum likelihood to

(5.6)    $L = (2\pi)^{-\frac{1}{2}TH}\,|\Omega_{xx}|^{-\frac{1}{2}T}\exp\left\{-\tfrac{1}{2}\sum_{t=1}^T (x_t - u_t\Pi_{xu}' - v_t\Pi_{xv}')\,\Omega_{xx}^{-1}\,(x_t - u_t\Pi_{xu}' - v_t\Pi_{xv}')'\right\}$

under the restrictions (4.1) and (3.19). Replacing $v_t$ by $s_t$ and adding (4.1) and (3.19) multiplied by Lagrange multipliers $\lambda$ (a vector of $D$ coordinates) and $\varphi$ respectively to the logarithm of $L$ we obtain after division by $T$

(5.7)    $\Lambda = -\tfrac{1}{2}H\log 2\pi + \tfrac{1}{2}\log|\Omega_{xx}^{-1}| + \beta\Pi_{xs}\lambda' + \varphi(\beta\Phi_{xx}\beta' - 1) - \dfrac{1}{2T}\sum_{t=1}^T (x_t - u_t\Pi^{*\prime}_{xu} - s_t\Pi_{xs}')\,\Omega_{xx}^{-1}\,(x_t - u_t\Pi^{*\prime}_{xu} - s_t\Pi_{xs}')'$.

Differentiating (5.7) with respect to $\beta$, we obtain

(5.8)    $\dfrac{\partial\Lambda}{\partial\beta} = \Pi_{xs}\lambda' + 2\varphi\Phi_{xx}\beta'$.

Setting this equal to zero and multiplying by $\beta$, we have

$\beta\Pi_{xs}\lambda' + 2\varphi\beta\Phi_{xx}\beta' = 0$.

By virtue of (4.1) and (3.19), the Lagrange multiplier $\varphi$ must be zero. Hence, as far as the derivatives of (5.7) are concerned the restriction (3.19) does not enter. The setting of the derivatives of (5.7) equal to zero and (4.1) will define $\beta$ except for a constant of proportionality which is finally determined by (3.19). For convenience in deriving the estimates we shall use the normalization

(5.9)    $\beta\Omega_{xx}\beta' = 1$.


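The derivatives of (5.7) with respect to the coordinates of $\Omega_{xx}^{-1}$, $\Pi^*_{xu}$, $\Pi_{xs}$, and $\beta$ are set equal to zero, resulting in

(5.10)   $\hat\Omega_{xx} = M_{xx} - M_{xu}\hat\Pi^{*\prime}_{xu} - M_{xs}\hat\Pi_{xs}' - \hat\Pi^*_{xu}M_{ux} - \hat\Pi_{xs}M_{sx} + \hat\Pi^*_{xu}M_{uu}\hat\Pi^{*\prime}_{xu} + \hat\Pi_{xs}M_{ss}\hat\Pi_{xs}'$,

(5.11)   $\hat\Omega_{xx}^{-1}(M_{xs} - \hat\Pi_{xs}M_{ss}) + \hat\beta'\lambda = 0$,

(5.12)   $\hat\Omega_{xx}^{-1}(M_{xu} - \hat\Pi^*_{xu}M_{uu}) = 0$,

(5.13)   $\hat\Pi_{xs}\lambda' = 0$.

Solving (5.12) for $\hat\Pi^*_{xu}$, we obtain

(5.14)   $\hat\Pi^*_{xu} = P_{xu}$,

defined by (4.12). Solving (5.11) for $\hat\Pi_{xs}$, we obtain

(5.15)   $\hat\Pi_{xs} = P_{xs} + \hat\Omega_{xx}\hat\beta'\lambda M_{ss}^{-1}$.

Multiplying (5.15) by $\hat\beta$ and solving for $\lambda$, we obtain

(5.16)   $\lambda = -\hat\beta P_{xs}M_{ss}$.

Substitution into (5.15) gives

(5.17)   $\hat\Pi_{xs} = (I - \hat\Omega_{xx}\hat\beta'\hat\beta)P_{xs}$.

In view of (5.14) and (5.17) we can write (5.10) as

(5.18)   $\hat\Omega_{xx} = W_{xx} + \hat\Omega_{xx}\hat\beta'\hat\beta P_{xs}M_{ss}P_{xs}'\hat\beta'\hat\beta\hat\Omega_{xx}$.

Let

(5.19)   $\hat\beta P_{xs}M_{ss}P_{xs}'\hat\beta' = \mu$.

Then multiplication of (5.18) on the right by $\hat\beta'$ with use of (5.9) gives

$\hat\Omega_{xx}\hat\beta' = W_{xx}\hat\beta' + \hat\Omega_{xx}\hat\beta'\hat\beta P_{xs}M_{ss}P_{xs}'\hat\beta' = W_{xx}\hat\beta' + \mu\hat\Omega_{xx}\hat\beta'$,

that is,

(5.20)   $\hat\Omega_{xx}\hat\beta' = \dfrac{1}{1 - \mu}W_{xx}\hat\beta'$.

Equation (5.13) can be written as

(5.21)   $P_{xs}M_{ss}P_{xs}'\hat\beta' - \mu\hat\Omega_{xx}\hat\beta' = 0$

by substitution from (5.16), (5.17) and (5.19). Combining (5.20) and (5.21) we obtain

(5.22)   $(P_{xs}M_{ss}P_{xs}' - \nu W_{xx})\hat\beta' = 0$,

where

(5.23)   $\nu = \mu/(1 - \mu)$.

For (5.22) to have a solution, $\nu$ must be a root of

(4.14)   $|P_{xs}M_{ss}P_{xs}' - \nu W_{xx}| = 0$.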

Substituting from (5.20) into (5.18) we obtain

(5.24)   $\hat\Omega_{xx} = W_{xx} + \dfrac{\mu}{(1 - \mu)^2}W_{xx}\hat\beta'\hat\beta W_{xx} = W_{xx} + \nu(1 + \nu)W_{xx}\hat\beta'\hat\beta W_{xx}$.

To determine which root of (4.14) to use we shall compute the value of the likelihood function when these estimates are used. It will be convenient to use the solution $b$ of (4.13) with normalization (5.5). Thus $b$ is proportional to $\hat\beta$; in fact, since

$\hat\beta\hat\Omega_{xx}\hat\beta' = \dfrac{1}{1 - \mu}\hat\beta W_{xx}\hat\beta'$

from (5.20), we see that

$\hat\beta = b\sqrt{1 - \mu} = b/\sqrt{1 + \nu}$.


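Let the other solutions of (4.13) be $b_2, \cdots, b_H$, with corresponding roots $\nu_2, \cdots, \nu_H$, and

$B^* = \begin{pmatrix} b \\ b_2 \\ \vdots \\ b_H \end{pmatrix}$.

Since

(5.25)   $|\hat\Omega_{xx}| = |W_{xx} + \nu W_{xx}b'bW_{xx}|$,

we have

(5.26)   $|B^*|\,|\hat\Omega_{xx}|\,|B^{*\prime}| = |I + \nu B^*W_{xx}b'bW_{xx}B^{*\prime}|$.

Since

$bW_{xx}B^{*\prime} = (1, 0, \cdots, 0)$,

and since

$|B^*|^{-2} = |W_{xx}|$,

we deduce from (5.26)

$|\hat\Omega_{xx}| = |W_{xx}|(1 + \nu)$.

Multiplying (5.10) by $\hat\Omega_{xx}^{-1}$, taking the trace, and substituting in (5.6) we obtain

(5.27)   $\hat L = (2\pi e)^{-\frac{1}{2}TH}\,|W_{xx}|^{-\frac{1}{2}T}(1 + \nu)^{-\frac{1}{2}T}$.

This is a maximum if $\nu$ is the smallest root of (4.14).

The theorem now results. The expression for $\hat\sigma^2$ follows from

$\hat\sigma^2 = \hat\beta\hat\Omega_{xx}\hat\beta' = b\hat\Omega_{xx}b'/b\hat\Phi_{xx}b'$.

If $\Phi_{xx}$ is a known constant matrix, $\hat\Phi_{xx} = \Phi_{xx}$; if $\Phi_{xx}$ is a function of the parameters, $\hat\Phi_{xx}$ is the same function of the estimates.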





If we define

(5.28)   $\hat\gamma = -\hat\beta\hat\Pi_{xu}$,

we have by (4.9)

(5.29)   $\hat\gamma = -\hat\beta(\hat\Pi^*_{xu} - \hat\Pi_{xs}M_{vu}M_{uu}^{-1})$.

Since $\hat\beta$ annihilates $\hat\Pi_{xs}$, (5.3) results.

The estimate of $\Pi_{xs}$ is given by (5.17) and the estimate of $\Omega_{xx}$ is

(5.30)   $\hat\Omega_{xx} = W_{xx} + \nu W_{xx}b'bW_{xx}$.


6. The likelihood ratio test of restrictions. It has been assumed that the selected structural equation is identified by imposing the restrictions that certain coefficients are zero. It was noted in Section 3 that at least $G - 1$ such restrictions are necessary. If $D$, the number of restrictions on the predetermined variables, is more than $H - 1$, we can test the hypothesis that these $D$ coefficients are zero against the alternative that only a smaller number are zero. This is equivalent to a test that $\Pi_{xv}$ is of rank $H - 1$ against the alternative that the rank is $H$.

It can be seen intuitively that the smallest root $\nu$ of (4.14) indicates how near $P_{xs}$ is to being singular. This statistic can be used to test the hypothesis that $\Pi_{xv}$ is of rank $H - 1$. The test is similar to the test of rank suggested by P. L. Hsu [8]. The test is stated precisely in the following theorem:

THEOREM 2. Under assumptions A, B, C, and D the likelihood ratio criterion for testing the hypothesis that $\Pi_{xv}$ is of rank $H - 1$ against the alternative that it is of rank $H$ is

(6.1)    $(1 + \nu)^{-\frac{1}{2}T}$,

where $\nu$ is the smallest root of (4.14).

PROOF. If there is no restriction on $\Pi_{xs}$, the maximum likelihood estimate of $\Pi_{xs}$ is $P_{xs}$, of $\Pi^*_{xu}$ is $P_{xu}$, and of $\Omega_{xx}$ is $W_{xx}$. Then the likelihood function is

(6.2)    $(2\pi e)^{-\frac{1}{2}TH}\,|W_{xx}|^{-\frac{1}{2}T}$.

The ratio of the likelihood function (5.27), maximized under the hypothesis that the rank of $\Pi_{xv}$ is $H - 1$, to (6.2) is (6.1).

It is proved in the paper following the present one that under certain conditions (more general than those of Theorem 2)

(6.3)    $-2\log[(1 + \nu)^{-\frac{1}{2}T}] = T\log(1 + \nu)$

is distributed asymptotically as $\chi^2$ with $D - H + 1$ degrees of freedom. Thus an approximate test of significance is given by comparing (6.3) with a significance point of the $\chi^2$-distribution with degrees of freedom equal to the excess number of coefficients required to be zero (i.e., the number beyond the minimum required for identification).
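The criterion (6.3) and its degrees of freedom are simple to compute once the smallest root $\nu$ is available. The helper below is a hypothetical sketch (the function name and the numerical inputs are assumptions); it returns the statistic and the degrees of freedom, leaving the comparison with a tabulated $\chi^2$ point to the reader.

```python
import math

def overidentification_statistic(nu, T, D, H):
    """Return (T log(1 + nu), D - H + 1), the criterion (6.3) and its d.f."""
    if D < H - 1:
        raise ValueError("fewer than H - 1 excluded predetermined variables")
    return T * math.log1p(nu), D - H + 1

# Hypothetical values: smallest root 0.08 from T = 100 observations,
# with D = 3 excluded predetermined variables and H = 2.
stat, dof = overidentification_statistic(nu=0.08, T=100, D=3, H=2)
# The 5% point of chi-square with 2 degrees of freedom is about 5.99.
```

When $D = H - 1$ the degrees of freedom are zero and $\nu = 0$, so there is nothing to test; this agrees with the remark that the test applies only to restrictions beyond the minimum required for identification.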










7. Computational procedure. The estimation procedure in sections 4 and 5 
does not indicate the most efficient method for computing those estimates. The 
procedure given here is believed to be efficient for ordinary computational equip- 
ment and can easily be adapted for sequence-controlled computing machines. 

Let us see what expressions occur in the estimation procedure for $\beta$ and $\gamma$. We find that we must first know $P_{xs}M_{ss}P_{xs}'$, $W_{xx}$, and $P_{xu}$; these will suffice if $\Phi_{xx}$ is constant or $\Omega_{xx}$ to estimate $\beta$, $\gamma$, and $\sigma^2$. In what follows, we shall assume the normalization is $\beta^i = 1$, as the results for other normalizations follow immediately. Examining the estimation equations, we see that we may use any matrices proportional to the moment matrices. If equation (3.11) has a constant term, it is better to use moments about the mean and estimate the constant term by setting the calculated mean of the disturbances equal to zero. One possible method of correcting for the mean is to calculate

(7.1)    $m^*_{xy} = T\sum_t x_t y_t - \left(\sum_t x_t\right)\left(\sum_t y_t\right)$.

The estimation procedure for $\beta$, $\sigma^2$, and the remainder of $\gamma$ is not affected by correcting for the mean. The computational procedure indicated here is unchanged except for a factor of proportionality in the equation for $\hat\sigma^2$ if a different form of correction for the mean is used.

7.1. Calculation of $M_{xz}M_{zz}^{-1}M_{zx}$ and $W_{xx}$. It is known that

(7.2)    $W_{xx} = M_{xx} - M_{xz}M_{zz}^{-1}M_{zx}$.

We shall use (7.2) to compute $W_{xx}$. We shall compute $M_{xz}M_{zz}^{-1}M_{zx}$ by the method given by Dwyer [4]. Let us denote the element in the $i$th row and $j$th column of $M_{zz}$ by $a_{ij}$, and the element in the $i$th row and $j$th column of $M_{zx}$ by $b_{ij}$. Let us construct the following array

    c_11  c_12  ...  c_1K     e_11  e_12  ...  e_1H
    d_11  d_12  ...  d_1K     f_11  f_12  ...  f_1H
          c_22  ...  c_2K     e_21  e_22  ...  e_2H
          d_22  ...  d_2K     f_21  f_22  ...  f_2H
                ...                 ...
                     c_KK     e_K1  e_K2  ...  e_KH
                     d_KK     f_K1  f_K2  ...  f_KH

where

$c_{ij} = a_{ij} - \sum_{k<i} d_{ki}c_{kj}, \qquad 1 \le i \le j \le K$,

$e_{ij} = b_{ij} - \sum_{k<i} d_{ki}e_{kj}, \qquad 1 \le i \le K,\ 1 \le j \le H$,

$d_{ij} = \dfrac{c_{ij}}{c_{ii}}, \qquad 1 \le i \le j \le K$,

$f_{ij} = \dfrac{e_{ij}}{c_{ii}}, \qquad 1 \le i \le K,\ 1 \le j \le H$.

Then the element in the $i$th row and $j$th column of the symmetric matrix $M_{xz}M_{zz}^{-1}M_{zx}$ is

$\sum_{k=1}^K e_{ki}f_{kj}$.

If we wish to estimate several equations in the system by this method, this step need only be done once, as $M_{xz}M_{zz}^{-1}M_{zx}$ and $W_{xx}$ do not depend upon the equation (except that $x$ would be enlarged).

7.2. Computation of $P_{xu}$. We shall compute $P_{xu}$ by the abbreviated Doolittle method. Let us now denote the element in the $i$th row and $j$th column of $M_{uu}$ by $a_{ij}$, of $M_{ux}$ by $b_{ij}$. Then let us perform the previous operations, not including the last step. We may arrange the work, if only one equation is to be estimated, so that this is already done. Then define

$g_{ij} = f_{ij} - \sum_{i<k\le F} d_{ik}g_{kj}, \qquad 1 \le i \le F,\ 1 \le j \le H$.

Then the element in the $i$th row and $j$th column of $P_{xu}$ is $g_{ji}$.
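The forward course of section 7.1 and the back substitution of section 7.2 can be sketched as follows. This is an illustrative transcription of the recursions for $c_{ij}$, $d_{ij}$, $e_{ij}$, $f_{ij}$ and $g_{ij}$ into array code, not a reproduction of the desk-calculator layout; the function names and the test matrices are assumptions.

```python
import numpy as np

def doolittle_forward(A, B):
    """Forward course: A is symmetric K x K (a_ij), B is K x H (b_ij).

    Returns upper-triangular c, d (K x K) and e, f (K x H) of section 7.1.
    """
    K, H = B.shape
    c = np.zeros((K, K))
    d = np.zeros((K, K))
    e = np.zeros((K, H))
    f = np.zeros((K, H))
    for i in range(K):
        c[i, i:] = A[i, i:] - d[:i, i] @ c[:i, i:]   # c_ij = a_ij - sum d_ki c_kj
        e[i, :] = B[i, :] - d[:i, i] @ e[:i, :]      # e_ij = b_ij - sum d_ki e_kj
        d[i, i:] = c[i, i:] / c[i, i]
        f[i, :] = e[i, :] / c[i, i]
    return c, d, e, f

def quadratic_form(A, B):
    """B' A^{-1} B, with (i, j) element sum_k e_ki f_kj."""
    _, _, e, f = doolittle_forward(A, B)
    return e.T @ f

def back_solution(A, B):
    """A^{-1} B by the back substitution g_ij = f_ij - sum_{k>i} d_ik g_kj."""
    _, d, _, f = doolittle_forward(A, B)
    g = np.zeros_like(f)
    for i in range(A.shape[0] - 1, -1, -1):
        g[i, :] = f[i, :] - d[i, i + 1:] @ g[i + 1:, :]
    return g

# Hypothetical moment matrices built from random data.
rng = np.random.default_rng(2)
Zm = rng.normal(size=(50, 4))
Xm = rng.normal(size=(50, 2))
Mzz = Zm.T @ Zm / 50
Mzx = Zm.T @ Xm / 50

Q = quadratic_form(Mzz, Mzx)    # M_xz M_zz^{-1} M_zx of section 7.1
G = back_solution(Mzz, Mzx)     # transpose of M_xz M_zz^{-1}
```

Applied to $M_{uu}$ and $M_{ux}$, the transpose of `back_solution` gives $P_{xu}$, in agreement with the statement that the $(i, j)$ element of $P_{xu}$ is $g_{ji}$.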
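7.3. Computation of $P_{xs}M_{ss}P_{xs}'$. We know that

(7.3)    $P_{xs}M_{ss}P_{xs}' = M_{xz}M_{zz}^{-1}M_{zx} - M_{xu}M_{uu}^{-1}M_{ux}$.

Let us compute $P_{xs}M_{ss}P_{xs}'$, using (7.3). We must first calculate $M_{xu}M_{uu}^{-1}M_{ux}$. We may do this either by the method of section 7.1, or as $P_{xu}M_{ux}$.

7.4. Computation of $\nu$, $\hat\beta$, and $\hat\gamma$. We shall use

(5.3)    $\hat\gamma = -\hat\beta P_{xu}$

to compute $\hat\gamma$ after $\hat\beta$ has been computed.

Case 1) $H = 1$. In this case the vector $\hat\beta = (1)$, $\nu = P_{xs}M_{ss}P_{xs}'/W_{xx}$.

Case 2) $H = 2$, $D > 1$. Let $a_{ij}$ denote the element in the $i$th row and $j$th column of $P_{xs}M_{ss}P_{xs}'$, $w_{ij}$ the element in the $i$th row and $j$th column of $W_{xx}$. Define

$k_0 = |P_{xs}M_{ss}P_{xs}'|$,
$k_1 = |W_{xx}|$,
$k_2 = \tfrac{1}{2}(a_{11}w_{22} + a_{22}w_{11} - 2a_{12}w_{12})$.

Then

$\nu = \dfrac{k_2 - \sqrt{k_2^2 - k_0k_1}}{k_1}$.

Let $\Theta = P_{xs}M_{ss}P_{xs}' - \nu W_{xx}$. Then

$\hat\beta^1 = 1$,

$\hat\beta^2 = -\dfrac{\theta_{11}}{\theta_{12}} = -\dfrac{\theta_{12}}{\theta_{22}}$.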
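For $H = 2$ the determinantal equation (4.14) is the quadratic $k_1\nu^2 - 2k_2\nu + k_0 = 0$, whose smaller root is the expression given in Case 2. A short sketch of this closed form follows; the function name and the two matrices are hypothetical, and the code assumes $\theta_{12} \neq 0$.

```python
import numpy as np

def smallest_root_2x2(A, W):
    """Smallest root of |A - nu W| = 0 for symmetric 2 x 2 A, W (Case 2)."""
    k0 = np.linalg.det(A)
    k1 = np.linalg.det(W)
    k2 = 0.5 * (A[0, 0] * W[1, 1] + A[1, 1] * W[0, 0] - 2.0 * A[0, 1] * W[0, 1])
    nu = (k2 - np.sqrt(k2 ** 2 - k0 * k1)) / k1
    theta = A - nu * W
    beta = np.array([1.0, -theta[0, 0] / theta[0, 1]])  # assumes theta_12 != 0
    return nu, beta

# Hypothetical values standing in for P_xs M_ss P_xs' and W_xx.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
W = np.array([[1.0, 0.2],
              [0.2, 1.5]])
nu, beta = smallest_root_2x2(A, W)
```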





Case 3) $H = 2$, $D = 1$. In this case $\nu = 0$. Then $\Theta = P_{xs}M_{ss}P_{xs}'$, and $\hat\beta$ may be computed as before.

Case 4) $H > 2$, $D > H - 1$. Using the procedure of section 7.2, compute $A = (P_{xs}M_{ss}P_{xs}')^{-1}W_{xx}$. Let us multiply equation (5.22) by $-\dfrac{1}{\nu}(P_{xs}M_{ss}P_{xs}')^{-1}$, and set $1/\nu = \lambda$. We obtain

(7.4)    $(A - \lambda I)\hat\beta' = 0$,

where $\lambda$ is the largest characteristic root of $A$. Then we may employ the method of Aitken [1] to estimate $\lambda$ and $\hat\beta$. Let $q_0$ be an approximation to $\hat\beta'$. The column of $A$ with largest absolute values is generally a satisfactory approximation. Define

$q_i = Aq_{i-1}$,

$\lambda_{ij} = \dfrac{q_{ij}}{q_{i-1,j}}$,

where $q_{ij}$ is the $j$th coordinate of $q_i$. The quantities $\lambda_{ij}$ approach $\lambda$ as $i$ increases, and the normalized vectors $q_i$ approach $\hat\beta'$. The convergence may be accelerated by the methods given by Aitken. The normalization should not be carried out until the $\lambda_{ij}$ are sufficiently close for different $j$.
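The iteration of Case 4 is what is now usually called the power method. The sketch below follows the recursion $q_i = Aq_{i-1}$ and monitors the coordinate ratios $\lambda_{ij}$, but rescales $q_i$ at every step to avoid overflow (a departure from the hand-computation advice above, made only for numerical convenience). It omits Aitken's acceleration, assumes no coordinate of $q_i$ vanishes, and uses hypothetical matrices and names.

```python
import numpy as np

def largest_root(A, tol=1e-12, max_iter=10_000):
    """Power iteration for the largest characteristic root of A.

    Starts from the column of A with the largest absolute column sum.
    Returns (lambda estimate, vector scaled to unit length).
    """
    q = A[:, np.argmax(np.abs(A).sum(axis=0))].copy()
    ratios = np.full(q.shape, np.nan)
    for _ in range(max_iter):
        q_next = A @ q
        ratios = q_next / q                   # the lambda_ij, one per coordinate
        q = q_next / np.linalg.norm(q_next)   # rescaling only; ratios unaffected
        if ratios.max() - ratios.min() <= tol * np.abs(ratios).max():
            break
    return ratios.mean(), q

# Hypothetical 3 x 3 matrices standing in for P_xs M_ss P_xs' and W_xx.
PMP = np.array([[4.0, 1.0, 0.0],
                [1.0, 3.0, 1.0],
                [0.0, 1.0, 2.0]])
W = np.array([[2.0, 0.3, 0.0],
              [0.3, 1.0, 0.2],
              [0.0, 0.2, 1.5]])
A = np.linalg.solve(PMP, W)       # A = (P_xs M_ss P_xs')^{-1} W_xx

lam, q = largest_root(A)
nu = 1.0 / lam                    # smallest root of (4.14)
b = q / q[0]                      # normalization beta^1 = 1
```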
Case 5) $H > 2$, $D = H - 1$. Let us go through the procedure of section 7.2 with $A = P_{xs}M_{ss}P_{xs}'$, and with no matrix $B$. Then $c_{HH} = 0$. Set $g_H = 1$, and compute

$g_i = -\sum_{i<k\le H} d_{ik}g_k$.

Then

$\hat\beta^j = \dfrac{g_j}{g_i}, \qquad \nu = 0$.

7.5. Computation of $\hat\sigma^2$. We have

(7.5)    $\hat\sigma^2 = \hat\beta\hat\Omega_{xx}\hat\beta' = (1 + \nu)\hat\beta W_{xx}\hat\beta'$.

If we use the $m^*$'s instead of the $m$'s, we must divide by $T^2$, and if other factors of proportionality are used, we must divide by them. $\hat\sigma^2$ is in general biased, but the bias depends upon the nature of the complete system, and is not easy to calculate. The bias is of the order of $1/T$.


8. Confidence regions based on small sample theory.⁵ If all of the predetermined variables in the system are exogenous (i.e., "fixed"), we can obtain confidence regions for the coefficients of one equation on the basis of small sample theory. To do this we require only that the disturbance of the selected equation be normally distributed; that is, the linear form in the observations $\beta x_t' + \gamma u_t'$ is normally distributed with mean zero and variance $\sigma^2$. The regression of this on fixed variates is normally distributed and certain quadratic forms in these linear forms have $\chi^2$-distributions. On the basis of this we can set up confidence regions for the coefficients.

5 We are indebted to Professor A. Wald for assistance in simplifying our approach to this problem.

In addition to assumptions A and B we use the following: 

ASSUMPTION E. All of the coordinates of $z_t = (u_t\ v_t)$ are exogenous. The
moment matrix $M_{zz}$ is non-singular. The disturbances of the selected equation are
distributed independently and normally with mean 0 and variance $\sigma^2$.

Suppose we have a set of observations $(x_1, u_1, v_1), \dots, (x_T, u_T, v_T)$. If
we know $\beta$ and $\gamma$ we can obtain T values of


(8.1) $w_t = \beta x_t' + \gamma u_t'$, $t = 1, \dots, T$.


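The sample regression coefficients of $w_t$ on $u_t$ and $s_t$ are

(8.2) $c = \dfrac{1}{T}\sum_{t=1}^{T} w_t u_t M_{uu}^{-1} = \beta M_{xu}M_{uu}^{-1} + \gamma$,
(8.3) $e = \dfrac{1}{T}\sum_{t=1}^{T} w_t s_t M_{ss}^{-1} = \beta M_{xs}M_{ss}^{-1}$.

The two vectors c and e are distributed independently and normally with mean 0
and covariance matrices

(8.4) $\mathcal{E}(c'c) = \dfrac{\sigma^2}{T} M_{uu}^{-1}$,
(8.5) $\mathcal{E}(e'e) = \dfrac{\sigma^2}{T} M_{ss}^{-1}$.

Hence (by usual regression theory)

(8.6) $C = \dfrac{T}{\sigma^2}\, c M_{uu} c' = \dfrac{T}{\sigma^2}\,(\beta M_{xu}M_{uu}^{-1}M_{ux}\beta' + \beta M_{xu}\gamma' + \gamma M_{ux}\beta' + \gamma M_{uu}\gamma')$,

(8.7) $E = \dfrac{T}{\sigma^2}\, e M_{ss} e' = \dfrac{T}{\sigma^2}\,\beta M_{xs}M_{ss}^{-1}M_{sx}\beta'$
      $= \dfrac{T}{\sigma^2}\,\beta (M_{xv} - M_{xu}M_{uu}^{-1}M_{uv})(M_{vv} - M_{vu}M_{uu}^{-1}M_{uv})^{-1}(M_{vx} - M_{vu}M_{uu}^{-1}M_{ux})\beta'$,

(8.8) $A = \dfrac{1}{\sigma^2}\sum_{t=1}^{T} w_t^2 - C - E = \dfrac{T}{\sigma^2}\,\beta W_{xx}\beta'$,

are distributed independently as $\chi^2$ with F, D, and T - K degrees of freedom,
respectively. The ratio of any two has an F-distribution.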

On the basis of these considerations we can obtain the desired confidence 
regions. 

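THEOREM 3. Suppose assumptions A, B, and E are true. If the normalization
is

(8.9) $\beta\Phi_{xx}\beta' = 1$,

where $\Phi_{xx}$ is a given matrix, (a) a confidence region for $\beta$ of confidence $\varepsilon$ consists of
all $\beta^*$ satisfying (8.9) and

(8.10) $\dfrac{\beta^* M_{xs}M_{ss}^{-1}M_{sx}\beta^{*\prime}}{\beta^* W_{xx}\beta^{*\prime}} \cdot \dfrac{T-K}{D} \le F_{D,T-K}(\varepsilon)$,

where $F_{D,T-K}(\varepsilon)$ is chosen so the probability of (8.10) for $\beta^* = \beta$ is $\varepsilon$. (b) A
confidence region for $\beta$ and $\gamma$ simultaneously consists of all $\beta^*$ and $\gamma^*$ satisfying
(8.9) and

(8.11) $\dfrac{\beta^* M_{xu}M_{uu}^{-1}M_{ux}\beta^{*\prime} + \beta^* M_{xu}\gamma^{*\prime} + \gamma^* M_{ux}\beta^{*\prime} + \gamma^* M_{uu}\gamma^{*\prime} + \beta^* M_{xs}M_{ss}^{-1}M_{sx}\beta^{*\prime}}{\beta^* W_{xx}\beta^{*\prime}} \cdot \dfrac{T-K}{K} \le F_{K,T-K}(\varepsilon)$.
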
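(c) If the normalization is $\sigma^2 = 1$, then a confidence region for $\beta$ of confidence
$\varepsilon_1\varepsilon_2$ consists of all $\beta^*$ satisfying


(8.12) $T\beta^* M_{xs}M_{ss}^{-1}M_{sx}\beta^{*\prime} \le \chi^2_D(\varepsilon_1)$,
(8.13) $\underline{\chi}^2_{T-K}(\varepsilon_2) \le T\beta^* W_{xx}\beta^{*\prime} \le \bar{\chi}^2_{T-K}(\varepsilon_2)$,


where $\chi^2_D(\varepsilon_1)$ is chosen so that the probability of (8.12) is $\varepsilon_1$ when $\beta^* = \beta$, and $\underline{\chi}^2_{T-K}(\varepsilon_2)$
and $\bar{\chi}^2_{T-K}(\varepsilon_2)$ are chosen so that the probability of (8.13) is $\varepsilon_2$ when $\beta^* = \beta$ and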


(8.14) © x(e@) J 1 < x (e). 


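(d) A confidence region for $\beta$ and $\gamma$ simultaneously consists of all $\beta^*$ and $\gamma^*$ satisfying
(8.13) and


(8.15) $T(\beta^* M_{xu}M_{uu}^{-1}M_{ux}\beta^{*\prime} + \beta^* M_{xu}\gamma^{*\prime} + \gamma^* M_{ux}\beta^{*\prime} + \gamma^* M_{uu}\gamma^{*\prime}$
       $+ \beta^* M_{xs}M_{ss}^{-1}M_{sx}\beta^{*\prime}) \le \chi^2_K(\varepsilon_1)$.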


Region (c) is the interior of an ellipsoid and an ellipsoidal shell in the $\beta^*$-space;
region (d) is similar in the $\beta^*$, $\gamma^*$-space. Region (a) consists of the intersection
of the quadric surface (8.9) and the interior of a cone in the $\beta^*$-space; region (b)
is similar in the $\beta^*$, $\gamma^*$-space.

It is clear that there are many other ways of constructing confidence regions
by taking regression on other fixed variates. Of these the best seem to be those
of Theorem 3. It has been proved [2] that the regions of Theorem 3 are consistent
in the sense that for sufficiently large T the probability is arbitrarily near 1 that
all of the confidence region is within a certain distance of $\beta$ or $\beta$, $\gamma$. For an
application of this technique to economic data see a paper by Bartlett [3] who
suggested this method independently.


9. An approximate small sample test of restrictions. When $\beta^* = \beta$, the
probability of (8.10) is $\varepsilon$. If $\beta^*$ is replaced by the vector which minimizes the expression
on the left, the probability is at least as great; it is, say, $1 - \delta$. This ratio is $\lambda$,
the smallest root of

(9.1) $\left| M_{xs}M_{ss}^{-1}M_{sx} - \lambda\,\dfrac{D}{T-K}\,W_{xx} \right| = 0$.

Since

(9.2) $\lambda = \dfrac{T-K}{D}\,\nu$,

where $\nu$ is the smallest root of (4.14), the probability of

(9.3) $\nu\,\dfrac{T-K}{D} > F_{D,T-K}(\varepsilon)$

is $\delta \le (1 - \varepsilon)$. We summarize this as follows:

THEOREM 4. Under assumptions A, B, and E, the inequality (9.3), where $\nu$ is
the smallest root of (4.14), constitutes a test of the hypothesis that the coefficients of
$v_t$ in the selected structural equation are zero of significance less than $1 - \varepsilon$.

This test is simply an approximation to the test given in section 6. The
exact probability, $\delta$, of (9.3) is unknown; in fact the distribution of $\nu$ depends on
II,» and the distribution of 5,. However, since 6 lies between 0 and 1 — e, we 
know that if the test is used as though the level were $1 - \varepsilon$, the test will be
"conservative."

Another approximate test of the restrictions can be obtained from the in-
equality (8.11). If the hypothesis is rejected on the basis of one of these tests,
the corresponding confidence region (for $\beta$ or for $\beta$ and $\gamma$) is imaginary, for all
$\beta$ or $\beta$ and $\gamma$ are excluded. It should be noticed that the use of a given ratio
to test the hypothesis at significance level $\delta\ (\le 1 - \varepsilon)$ does not affect the con-
fidence coefficient $\varepsilon$ of the confidence region when the hypothesis is true.
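The arithmetic of the test in Theorem 4 can be sketched as follows. This is a minimal illustration, not the authors' procedure in full: the function and argument names are hypothetical, the root $\nu$ is assumed to have been computed already, and the critical value of the F-distribution with D and T - K degrees of freedom must be supplied by the user from a table.

```python
def restriction_test(nu, T, K, D, f_crit):
    """Apply the approximate test of restrictions.

    nu     : smallest root of the determinantal equation (assumed given)
    T      : number of observations
    K      : number of predetermined variables in the system
    D      : number of predetermined variables excluded from the equation
    f_crit : tabulated F critical value with (D, T - K) degrees of freedom

    Returns the ratio lambda = nu * (T - K) / D and whether it exceeds f_crit.
    """
    lam = (T - K) / D * nu
    return lam, lam > f_crit


# Hypothetical numbers: nu = 0.5, T = 30, K = 6, D = 2; 3.40 is the
# tabulated upper 5 per cent point of F with 2 and 24 degrees of freedom.
lam, reject = restriction_test(0.5, 30, 6, 2, 3.40)
print(lam, reject)
```

Since the true probability of rejection is at most the nominal one, a rejection obtained this way errs on the conservative side, as noted above.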


REFERENCES 


[1] A. C. AITKEN, "Studies in practical mathematics II. The evaluation of the latent
roots and latent vectors of a matrix," Edinb. Math. Soc. Proc., Vol. 57 (1936-7),
pp. 269-305.

[2] T. W. ANDERSON AND HERMAN RUBIN, "The asymptotic properties of estimates of the
parameters of a single equation in a complete system of stochastic equations,"
to be published.

[3] M. S. BARTLETT, "A note on the statistical estimation of demand and supply relations
from time series," Econometrica, Vol. 16 (1948), pp. 323-329.

[4] P. S. DWYER, "Evaluation of linear forms," Psychometrika, Vol. 6 (1941), pp. 355-365.

[5] R. A. FISHER, "The statistical utilization of multiple measurements," Annals of
Eugenics, Vol. 8 (1938), pp. 376-386.

[6] M. A. GIRSHICK AND T. HAAVELMO, "Statistical analysis of the demand for food:
examples of simultaneous estimation of structural equations," Econometrica,
Vol. 15 (1947), pp. 79-110.

[7] T. HAAVELMO, "Statistical implications of a system of simultaneous equations,"
Econometrica, Vol. 11 (1943), pp. 1-12.

[8] P. L. HSU, "On the problem of rank and the limiting distribution of Fisher's test
function," Annals of Eugenics, Vol. 11 (1941), pp. 39-41.

[9] H. B. MANN AND A. WALD, "On the statistical treatment of linear stochastic difference
equations," Econometrica, Vol. 11 (1943), pp. 173-220.

[10] OLAV REIERSØL, "Confluence analysis by means of lag moments and other methods of
confluence analysis," Econometrica, Vol. 9 (1941), pp. 1-24.

[11] Statistical Inference in Dynamic Economic Systems, to be published as Cowles Com-
mission Monograph No. 10.








SOME SIGNIFICANCE TESTS FOR THE MEDIAN WHICH ARE 
VALID UNDER VERY GENERAL CONDITIONS¹


By John E. Walsh
The Rand Corporation 


1. Summary. Order statistics are used to derive significance tests for the 
population median which are valid under very general conditions. These tests 
are approximately as powerful as the Student t-test for small samples from a 
normal population. Also the application of a test requires very little computa-
tion. Thus the tests derived compare very favorably with the t-test for small
sets of observations. Applications of these order statistic tests to certain well 
known statistical problems are given in another paper [1]. 


PART I. RESULTS AND DEFINITIONS 


2. Introduction. Consider n independent observations drawn from n popu- 
lations satisfying the conditions (A): 

1) Each population is continuous (i.e. its cdf is continuous).

2) Each population is symmetrical.

3) The median of each population has the same value $\phi$. (If the 50% point
of a continuous symmetrical population is not unique, the median $\phi$ of the popu-
lation is defined to be the midpoint of the segment of 50% values.)

It is to be emphasized that no two of the observations are necessarily drawn
from the same population. Significance tests are derived to compare $\phi$ with a
given constant value $\phi_0$.

A general method of obtaining one-sided and symmetrical tests is given in sec-
tion 8. This general method furnishes tests which have significance levels of the
form $r/2^n$, $(r = 1, \dots, 2^n - 1)$. Each value of r can be attained for some one-
sided test. Unfortunately tests obtained by the general method are very difficult
to apply from a computational viewpoint. If n > 10, the number of computa-
tions required for the application of a test is prohibitive.

To overcome the computational difficulty involved in using the general method,
easily applied tests using order statistics are derived. These tests are based on
order statistics of certain combinations of order statistics of the n observations,
each combination being either a single order statistic of the n observations or
one-half the sum of two order statistics. The tests are invariant under permuta-
tion of the n observations and have significance levels of the form $r/2^n$,
$(r = 1, \dots, 2^n - 1)$. Table 1 contains a list of some one-sided and symmetrical
tests for $n \le 15$ ($x_1, \dots, x_n$ represent the n observations arranged in increasing
order of magnitude). Additional significance tests can be obtained by use of
Theorem 4 of section 6.


1 The results presented in this paper were obtained in the course of research conducted 
under the sponsorship of the Office of Naval Research. This research was performed while 
the author was at Princeton University. 




If a symmetrical population has a mean, the mean has the same value as the 
median. Thus if each population from which an observation is drawn satisfies 
the additional condition that its mean exists, the median tests derived in this 
paper are also tests of the mean. 

Although it is unlikely that conditions (A) are ever exactly satisfied in prac- 
tice, these conditions appear to be approximately satisfied in many practical 
situations. Moreover conditions (A) are of such a simple form that approximate 
verification can frequently be obtained without an extensive investigation. 

Certain of the order statistic tests are very efficient if the n observations are a 
sample from a normal population. Efficiencies are listed for some of the tests in 
Table 1. These tests are approximately as efficient as the Student t-test. (The 
efficiency of a test, more precisely the power efficiency, is defined in section 3.) 

The order statistic tests are competitive with the Student t-test. In choosing 
between the two types of tests the following considerations may be of interest: 

(a) The order statistic tests are valid under much more general conditions than 
the t-test. 

(b) The order statistic tests are almost as efficient as the t-test for small sam- 
ples from a normal population. 

(c) The order statistic tests are more easily computed than the t-test.

(d) For the case of a sample from a normal population and near significance 
the t-test gives more information than the order statistic tests. 

In some cases a set of n independent observations satisfying only 1) and 3) of 
conditions (A) can be transformed into observations approximately satisfying all 
of conditions (A) by an appropriate continuous monotonic change of variable. 
For example, replacing each observation by the logarithm of the value of the 
observation sometimes results in a set of observations having approximately 
symmetrical distributions. Since the transformation, say g(x), is continuous 
and monotonic, the resulting observations will have median $g(\phi)$ if the original
observations have median $\phi$. Confidence intervals can be found for $\phi$ by first
obtaining confidence intervals for $g(\phi)$ on the basis of conditions (A) and then
inverting. Significance tests can be obtained from these confidence intervals. 
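The transform-and-invert step can be sketched in Python as below. The sketch assumes n = 5 positive observations, an increasing transformation (the logarithm), and the one-sided bound based on one-half the sum of the two largest transformed values, whose confidence coefficient is 1 - 1/16 under conditions (A); the function name and the sample data are hypothetical.

```python
import math


def upper_bound_n5(observations, transform=math.log, inverse=math.exp):
    """One-sided upper confidence bound for the median from 5 observations.

    The bound (y4 + y5)/2 is formed on the transformed scale, where the
    observations are assumed approximately symmetrical, and is then carried
    back to the original scale with the inverse transformation.
    """
    if len(observations) != 5:
        raise ValueError("this sketch covers n = 5 only")
    y = sorted(transform(v) for v in observations)
    bound_on_transformed_scale = 0.5 * (y[3] + y[4])
    return inverse(bound_on_transformed_scale)


# Hypothetical positive observations with a long right tail.
print(upper_bound_n5([1.0, 2.0, 4.0, 8.0, 16.0]))
```

With the logarithm as transformation, the inverted bound is the geometric mean of the two largest observations rather than their arithmetic mean.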

The tests of Part I can be applied to furnish generalized solutions for several 
well known statistical problems. Some of these applications are given in another 
paper [1]. 

One application occurs in cases where there is reason to believe that condi- 
tions (A) are satisfied but there is no reason to assume that the populations from 
which the observations were drawn are even approximately the same. Perhaps 
the most common situation of this type is that in which the value of a certain 
quantity is experimentally determined by several different methods, all of which 
should theoretically yield the same result. Then there is no reason to believe 
that all the experimental values have the same precision. It may be permissible, 
however, to assume that each value is an observation from a continuous sym- 
metrical population and that all the populations have the same median. Then 
the order statistic tests can be used to test the true value of the quantity investi- 
gated. For example, consider the determination of a specified physical constant. 





TABLE 1
Some one-sided and symmetrical significance tests for $n \le 15$
[Columns: n; significance level of tests in per cent (one-sided, symmetrical); one-sided tests (accept $\phi < \phi_0$ if, accept $\phi > \phi_0$ if); symmetrical tests (accept $\phi \ne \phi_0$ if either); approximate efficiency for normality.]






Various scientists obtained experimental values for this constant by several differ- 
ent methods. If it can be assumed that each value is an observation from a 
continuous symmetrical population and that all the populations have the same 
median, the true value of the physical constant can be tested by applying the 
order statistic tests to the totality of the experimental values. 


3. Power efficiency of tests. A problem which arises throughout the paper 
is that of determining how much information is lost by using some other test in 
place of the most powerful test of a given hypothesis. The quantitative measure 
of the amount of available information which is used by a test will be given as a 
percentage and is called the power efficiency of the test considered. 

In all cases investigated the underlying population is normal with unknown 
variance and the hypotheses tested concern the population median (mean). 
Then the most powerful test (one-sided or symmetrical) is the appropriate 
Student t-test. 

The procedure used to measure the power efficiency of a test is different from 
the common method of measuring the efficiency of an estimate. The efficiency 
of an estimate is obtained by taking the ratio of the variance of an efficient esti- 
mate with respect to the variance of the given estimate (expressed as a per- 
centage). The method of determining the power efficiency of a test, however, 
consists in continuously varying the sample size of the appropriate most powerful 
test (same significance level) until the power functions of the given test and the 
most powerful test are equivalent in the following sense: The area between the 
two power curves for which the power function of the most powerful test exceeds 
the power function of the given test is equal to the analogous area for which the 
power function of the most powerful test is less than that of the given test. (It 
is assumed that the power functions of the tests can be made to depend on the 
values of a single parameter.) The sample size (not necessarily integral) of the 
most powerful test with equivalent power function divided by the sample size of the 
given test is called the power efficiency of the given test (expressed as a percentage). 
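Once the equivalent sample size of the t-test has been found by the area criterion, the power efficiency is a simple ratio. The short Python sketch below shows only that final step; the function name is hypothetical, and the sample sizes used in the example are those listed for two of the tests in Table 2.

```python
def power_efficiency(equivalent_t_sample_size, test_sample_size):
    """Power efficiency in per cent.

    equivalent_t_sample_size : (possibly non-integral) sample size of the
        t-test whose power function is equivalent to that of the given test
    test_sample_size         : sample size of the given test
    """
    return 100.0 * equivalent_t_sample_size / test_sample_size


# Values taken from Table 2: a t-test with sample size 4.9 matches an order
# statistic test with n = 5, and one with sample size 7.5 matches a test
# with n = 10.
print(round(power_efficiency(4.9, 5)))
print(round(power_efficiency(7.5, 10)))
```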

In obtaining power efficiencies in the manner defined above, the sample size 
of the most powerful test is allowed to assume non-integral values. This fur-
nishes an interpolated measure of the sample size of the most powerful test which is
power function equivalent to the given test. As pointed out above, the t-test
is a most powerful test for the situations considered in this paper. A method of 
computing power function values for t-tests having non-integral sample sizes is 
given below. 

The definition of power efficiency selected is very convenient from a computa- 
tional point of view. Power function values for the t-test can be easily computed 
through use of the normal approximation given in [2]. For the significance levels 
considered in this paper, the normal approximation is reasonably accurate if 
the sample size is not too small. In the remaining cases the approximation 
underestimates some power function values and overestimates others. For the 
situations investigated, however, the error introduced by this combination of
underestimation and overestimation tends to cancel out in the determination of
power efficiencies if the above area definition of equality of power functions is
used. Thus application of the normal approximation yields reasonably ac-
curate power efficiencies for the cases considered in this paper. Use of the
normal approximation furnishes an easily applied method of obtaining power
function values for t-tests having non-integral sample sizes.


TABLE 2
Efficiencies and power function values for certain order statistic tests

                                   Sample   Approx.      Signifi-   Value of power function
Significance test                  size     efficiency   cance      δ = .6   δ = 1.2   δ = 1.8
                                            (%)          level

t                                  4.9                   .0625      .387     .755      .964
½(x_4 + x_5) < φ_0                 5        98           .0625      .343     .755      .958
t                                  5.82                  .0469      .327     .779      .980
max[x_5, ½(x_4 + x_6)] < φ_0       6        97           .0469      .334     .779      .972
t                                  5.88                  .0312      .244     .682      .951
½(x_5 + x_6) < φ_0                 6        98           .0312      .254     .687      .942
t                                  6.65                  .0547      .406     .869      .994
max[x_5, ½(x_4 + x_7)] < φ_0       7        95           .0547      .413     .867      .991
t                                  6.85                  .0234      .239     .716      .969
max[x_6, ½(x_5 + x_7)] < φ_0       7        98           .0234      .249     .717      .962
t                                  7.55                  .0430      .395     .882      .996
max[x_6, ½(x_4 + x_8)] < φ_0       8        94.5         .0430      .404     .879      .993
t                                  7.85                  .0117      .174     .650      .956
max[x_7, ½(x_6 + x_8)] < φ_0       8        98           .0117      .185     .656      .949
t                                  8.64                  .0215      .302     .839      .994
max[x_7, ½(x_5 + x_9)] < φ_0       9        96           .0215      .311     .834      .990
t                                  8.9                   .0059      .127     .597      .947
max[x_8, ½(x_7 + x_9)] < φ_0       9        99           .0059      .137     .599      .935
t                                  7.5                   .0547      .450     .910      .998
x_8 < φ_0                          10       75           .0547      .454     .901      .995
t                                  9.65                  .0107      .227     .790      .991
t                                  8.2                   .0098      .176     .668      .964
max[x_9, ½(x_1 + x_10)] < φ_0      10       82           .0098      .191     .677      .952
t                                  8.9                   .0059      .141     .621      .954
x_10 < φ_0                         11       81           .0059      .152     .634      .942
t                                  11.22                 .0102      .277     .870      .998
max[x_9, ½(x_6 + x_12)] < φ_0      12       93.5         .0102      .288     .862      .995

Table 2 contains examples of the above described method of determining power 
efficiencies. Here the power function values for the t-test were computed using 
the normal approximation. Examination of Table 2 shows that the maximum 
difference between corresponding power function values for the two types of 
tests is small for all the cases considered there. This holds in the determination 
of all the power efficiencies listed in Table 1. 

Investigation indicates that the definition of power efficiency given here is for 
all practical purposes the same as that given in [3]. 

For the situations considered in this paper, it is sufficient to restrict power 
efficiency investigations to one-sided tests. Every symmetric test investigated 
can be considered as a combination of two non-overlapping one-sided tests, 
each having a significance level equal to half that of the symmetric test. Also, 
from symmetry, these one-sided tests (each considered as a separate test) have 
the same power efficiency. Thus it is an immediate consequence of the definition 
of power efficiency that the symmetric test has the same efficiency as each of the 
corresponding one-sided tests at half the significance level. 


PART II. DERIVATIONS 


4. Introduction. The purpose of the remainder of the paper is to present 
derivations of the significance test results stated in sections 1 and 2. The first 
derivations consist in obtaining confidence intervals for $\phi$ on the basis of condi-
tions (A). Then properties of these confidence intervals are analyzed. Applica-
tion of the confidence intervals and their properties to significance tests furnishes
many of the results stated in sections 1 and 2. The remaining derivations are 
concerned with efficiencies and the general method mentioned in section 2. 


5. Derivation of confidence intervals. Let us consider n independent ob-
servations, each observation being drawn from a possibly different population.
Denote these observations by $y_1, \dots, y_n$ and let the cdf of $y_i$ be given by $F_i$,
$(i = 1, \dots, n)$. Furthermore let the n populations from which these n ob-
servations were drawn satisfy conditions (A). Then 1) of conditions (A) re-
quires that each $F_i$ is continuous, while 2) and 3) stipulate that


$\int_{-\infty}^{\phi - c} dF_i(y) = \int_{\phi + c}^{\infty} dF_i(y)$, $(i = 1, \dots, n)$,


for all values of c in the interval $-\infty < c < \infty$.
Let $x_1, \dots, x_n$ represent $y_1, \dots, y_n$ arranged in increasing order of magni-
tude. Since the cdf's are continuous, $\Pr(x_i = x_j;\ i \ne j) = 0$. For the situations
treated in this paper, it is sufficient to consider one-sided confidence intervals
for $\phi$. All one-sided confidence intervals derived have one of the forms

(1)  $g(x_1, \dots, x_n) < \phi$,
     $h(x_1, \dots, x_n) > \phi$,

where g and h are Borel measurable functions of $x_1, \dots, x_n$ such that

$\Pr[g(x_1, \dots, x_n) < \phi] = \Pr[g(x_1 - \phi, \dots, x_n - \phi) < 0]$,
$\Pr[h(x_1, \dots, x_n) > \phi] = \Pr[h(x_1 - \phi, \dots, x_n - \phi) > 0]$.

Consider the additional condition
(B) All populations are the same.

In terms of cumulative distribution functions, condition (B) requires that all
the cdf's $F_i$ are equal to some cdf F. A theorem will be proved which shows that
all confidence intervals of the forms (1) derived on the basis of both conditions
(A) and (B) are also valid if only conditions (A) necessarily hold; i.e. if

$\Pr[g(x_1, \dots, x_n) < \phi] = p$

whenever $x_1, \dots, x_n$ are order statistics of observations from populations satis-
fying conditions (A) and (B), then this probability expression also has the value
p if $x_1, \dots, x_n$ are from populations necessarily satisfying only conditions (A).
Similarly for $\Pr[h(x_1, \dots, x_n) > \phi]$.

THEOREM 1. Let $Q(x_1 - \phi, \dots, x_n - \phi)$ be a probability statement involving
$x_1 - \phi, \dots, x_n - \phi$, which defines a Borel measurable region $R(x_1 - \phi, \dots, x_n - \phi)$
of the n-dimensional order statistic space. If

(2) $Q(x_1 - \phi, \dots, x_n - \phi) = p$

whenever $x_1, \dots, x_n$ are order statistics of n independent observations from popula-
tions satisfying conditions (A) and (B), then (2) also holds when $x_1, \dots, x_n$ are
order statistics of n independent observations from populations necessarily satis-
fying only conditions (A).

PROOF. It is sufficient to consider the case in which $\phi = 0$. Then, if condi-
tions (A) are satisfied, the joint probability element of $x_1, \dots, x_n$ is

$dF(x_1, \dots, x_n) = \sum_{\pi} dF_1(x_{\pi(1)}) \cdots dF_n(x_{\pi(n)})$,

where the summation is taken over all permutations $\pi$ of the integers 1, ..., n,
and the F's are cdf's of symmetrical populations with zero median. Let $R =
R(x_1, \dots, x_n)$ be the region of the n-dimensional order statistic space defined
by the probability statement $Q(x_1, \dots, x_n)$. Then Theorem 1 stipulates that

(3) $\int_R dF(x_1, \dots, x_n) = p$

whenever $y_1, \dots, y_n$ are from populations satisfying conditions (A) and (B)
with zero median. In this case, however, each $F_i = F$ and (3) becomes

(4) $n! \int_R \prod_{i=1}^{n} dF(x_i) = p$,

where F is the cdf of a population satisfying conditions (A) and (B) with zero
median. Let

$P = \prod_{i=1}^{n} \left( \sum_{j=1}^{n} dF_j(x_i) \right)$

and define $S_\alpha^\beta$ to be the sum of all terms in the expansion of P which contain a
specified $\alpha$ of $dF_1, \dots, dF_n$ and no others; the particular set chosen is denoted
by $\beta$, where $\beta = 1, \dots, \binom{n}{\alpha}$. Then

$P = dF(x_1, \dots, x_n) + \sum_\beta S_{n-1}^\beta + \cdots + \sum_\beta S_1^\beta$.

Now consider any given $S_\alpha^\beta$ (i.e. $\alpha$, $\beta$ given). Define dH to be the sum of the
$\alpha$ of $dF_1, \dots, dF_n$ pertaining to $\beta$ plus any set of zero or more of the remaining
dF's. Then no matter which of the remaining dF's are chosen for dH, the sum
of those terms in the expansion of $\prod_{i=1}^{n} dH(x_i)$ which contain the particular set of
$\alpha$ of $dF_1, \dots, dF_n$ is always equal to $S_\alpha^\beta$. Let

$P_\alpha = \sum_\beta \left( \prod_{i=1}^{n} dG_\alpha^\beta(x_i) \right)$,

where $dG_\alpha^\beta$ equals the sum of the $\alpha$ of $dF_1, \dots, dF_n$ pertaining to $\beta$. Then from
the above and the symmetrical fashion in which the dF's are treated,

$P_\alpha = \sum_\beta S_\alpha^\beta + K_\alpha^{(\alpha-1)} \sum_\beta S_{\alpha-1}^\beta + \cdots + K_\alpha^{(1)} \sum_\beta S_1^\beta$,

where the $K_\alpha^{(u)}$ $(u = 1, \dots, \alpha - 1)$, are constants.
Consider the case in which $\alpha = n - 1$. Using the above expression for $P_\alpha$,

$P = dF(x_1, \dots, x_n) + P_{n-1} + (1 - K_{n-1}^{(n-2)}) \sum_\beta S_{n-2}^\beta + \cdots + (1 - K_{n-1}^{(1)}) \sum_\beta S_1^\beta$.

Repeating this procedure successively for $\alpha = n - 2, n - 3, \dots, 1$ shows that

$dF(x_1, \dots, x_n) = P + C_{n-1}P_{n-1} + \cdots + C_1 P_1$,

where the $C_v$ $(v = 1, \dots, n - 1)$, are constants.

Since each $F_i$ is the cdf of a symmetrical population with zero median,

$G_\alpha^\beta/\alpha = \dfrac{1}{\alpha}$ (sum of the $\alpha$ of $F_1, \dots, F_n$ pertaining to $\beta$)

is also the cdf for a continuous symmetrical population with zero median. But

$P_\alpha = \alpha^n \sum_\beta \prod_{i=1}^{n} d[G_\alpha^\beta(x_i)/\alpha]$.

Hence $dF(x_1, \dots, x_n)$ is equal to a sum of terms (multiplied by certain con-
stants) of the form

$n! \prod_{i=1}^{n} dF(x_i)$,

where F is the cdf of a continuous symmetrical population with zero median.
Thus from (4) and the linear properties of the integral,

$\int_R dF(x_1, \dots, x_n) = p$

if $y_1, \dots, y_n$ are from populations necessarily satisfying only conditions (A).
Q.e.d.

Next confidence intervals of the forms (1) will be derived for $\phi$ on the basis of
conditions (A) and (B). Before stating the theorem on which these confidence
intervals are based consider the following definition of notation: For each per-
missible selection of i and j, the symbol

$\{i, j\}$  $(1 \le i \le j \le n)$

denotes an arbitrary but fixed selection of one or both of the inequality signs
<, >. The selection of both inequality signs, denoted by $\lessgtr$, has the interpre-
tation

$x_i \lessgtr \phi \equiv -\infty < x_i < \infty$,

$(x_i + x_j)/2 \lessgtr \phi \equiv -\infty < (x_i + x_j)/2 < \infty$.

It is to be noted that {r, s} is not necessarily equal to {i, j} unless r = i and
s = j.
THEOREM 2. Consider the probability statement

(5) $\Pr[(x_i + x_j)/2\ \{i, j\}\ \phi;\ 1 \le i \le j \le n]$.

Let this statement have the value q if $x_1, \dots, x_n$ are order statistics of a sample of
size n drawn from the uniform population with range $-\tfrac{1}{2}$ to $\tfrac{1}{2}$ (then $\phi = 0$). Then
(5) also has the value q if $x_1, \dots, x_n$ are order statistics of a sample of size n drawn
from any population satisfying conditions (A) and (B).

PROOF. Let $y_1, \dots, y_n$ be a sample of n values from a population satisfying
conditions (A) and (B) while $x_1, \dots, x_n$ are the y's arranged in increasing order
of magnitude. Then there is a monotone function $\pi$ (see [4]) such that $\pi(z)$ will
have the same cdf as $y_i - \phi$ if z is from a uniform population with range $-\tfrac{1}{2}$ to $\tfrac{1}{2}$.
Since the y's are from a symmetrical population, $-\pi(z) = \pi(-z)$. Let $x_i - \phi =
\pi(z_i)$, $(i = 1, \dots, n)$, define the $z_i$. Then

$\Pr[(x_i + x_j)/2\ \{i, j\}\ \phi] = \Pr[(\pi(z_i) + \pi(z_j))\ \{i, j\}\ 0]$
$= \Pr[\pi(z_i)\ \{i, j\}\ -\pi(z_j)]$.

From the monotone and symmetrical properties of the function $\pi$,

$\Pr[\pi(z_i)\ \{i, j\}\ -\pi(z_j)] = \Pr[\pi(z_i)\ \{i, j\}\ \pi(-z_j)]$
$= \Pr[z_i\ \{i, j\}\ -z_j]$.

By hypothesis this last expression has the value q, thus completing the proof.
Many of the probability statements of the form (5) have zero probability.
For example, $\Pr[x_1 > \phi, x_2 < \phi, \dots] = 0$. Also many selections of the symbols
{i, j} result in equivalent probability statements. For example

$\Pr(x_1 \lessgtr \phi,\ x_2 < \phi) = \Pr(x_1 < \phi,\ x_2 < \phi)$.

An immediate consequence of Theorem 2 is that one-sided confidence inter-
vals can be obtained for $\phi$ by choosing any specified subset of $(x_i + x_j)/2$,
$(1 \le i \le j \le n)$, and considering an arbitrary but fixed order statistic of the
values of this subset. For example, consider the subset consisting of $x_{n-1}$ and
$(x_{n-2} + x_n)/2$. Then

$\Pr\{\max[x_{n-1}, (x_{n-2} + x_n)/2] < \phi\} = \Pr\{(x_i + x_j)/2\ \{i, j\}\ \phi\}$,

where

$\{i, j\} = <$ if either $i = j = n - 1$, or $i = n - 2$, $j = n$;
$\{i, j\} = \lessgtr$ otherwise.

In general, the confidence coefficient of any one-sided confidence interval
formed by considering a certain order statistic of a specified subset of $(x_i + x_j)/2$,
$(1 \le i \le j \le n)$, can be expressed as a sum of probabilities of the form (5),
where $\{i, j\} = \lessgtr$ if $(x_i + x_j)/2$ is not included in the specified subset, $(i \le j)$.

It is usually preferable to select the subset of $(x_i + x_j)/2$, $(1 \le i \le j \le n)$,
in such a way that no two of the elements chosen necessarily have an order
relation.

Satisfactory two-sided confidence intervals can usually be obtained as combina-
tions of one-sided confidence intervals.

6. Confidence coefficients. The purpose of this section is to show that all
the confidence coefficients for one-sided confidence intervals derived on the basis
of Theorem 2 are of the form $r/2^n$, $(r = 1, \ldots, 2^n - 1)$. Also a method of
determining confidence coefficient values for one-sided confidence intervals is
developed.

First a theorem will be presented which shows that each of the one-sided
confidence intervals derived in the preceding section has a confidence coefficient of
the form $r/2^n$, $(r = 1, \ldots, 2^n - 1)$. On the basis of Theorem 2 it is sufficient
to prove:




THEOREM 3. Let $x_1, \ldots, x_n$ be the ordered values of a sample from the uniform
population with range $-\tfrac12$ to $\tfrac12$. Then
$$\Pr[(x_i + x_j)/2 \ \{i,j\}\ 0;\ 1 \le i \le j \le n] = r/2^n,$$
where $r$ has one of the values $0, 1, \ldots, 2^n$. (The symbol $\{i, j\}$ is defined in section 5.)
SKETCH OF PROOF. This theorem is proved by investigating how the hyperplanes

$$\tfrac12(x_i + x_j) = 0 \quad (1 \le i \le j \le n)$$

intersect the $n$-dimensional order statistic space for the particular population
considered. It is found that each relation of the form

$$\tfrac12(x_i + x_j) \ \{i,j\}\ 0 \quad (1 \le i \le j \le n)$$

defines a region of the $n$-dimensional order statistic space which consists of a
certain number $r$ of $n$-dimensional "basic" cells, each of which has an
$n$-dimensional "volume" equal to $(\tfrac12)^n$. A detailed proof of this theorem is given in
[5].

Next a method will be developed whereby confidence coefficient values can 
be determined for any one-sided confidence interval of the form 


For this purpose it is sufficient to derive a procedure for determining the con- 
fidence coefficient of any confidence interval of the form 


(7) $\max[\text{certain subset of } \tfrac12(x_i + x_j);\ 1 \le i \le j \le n] < \phi.$


The confidence coefficient of any one-sided confidence interval of the form
$\min[\ ] > \phi$ can be obtained by symmetry. The confidence coefficient of any
other one-sided confidence interval of the form (6) can be found by expressing
the value of

$$\Pr[\tfrac12(x_i + x_j) \ \{i,j\}\ \phi]$$

as a sum of terms of the form $\Pr\{\max[\ ] < \phi\}$ or as a sum of terms of the form
$\Pr\{\min[\ ] > \phi\}$. That this is always possible for one-sided confidence intervals
of the form (6) is shown by direct application of the results of page 17 of [6].

It is not difficult to show that any one-sided confidence interval of the form
(7) can be expressed in the form

$$\max\{x(n-k),\ \tfrac12[x(n-k+1) + x(n-m_k-k+1)],\ \ldots,\ \tfrac12[x(n) + x(n-m_1)]\} < \phi,$$

where

$$x(t) = x_t \quad (t = 1, \ldots, n),$$

and $m_1, \ldots, m_k$ are $k$ integers such that

$$n \ge m_1 > m_2 > \cdots > m_k > 0.$$

This is done by choosing $k, m_1, \ldots, m_k$ so that the two confidence intervals are
equivalent.

Thus it is sufficient to prove the following theorem: 

THEOREM 4. Let $x(1), \ldots, x(n)$ represent the ordered values of $n$ independent
observations drawn from populations satisfying conditions (A). Choose a set of $k$
integers $m_1, \ldots, m_k$ such that

$$n \ge m_1 > m_2 > \cdots > m_k > 0.$$

Then the one-sided confidence interval

(8) $\max\{x(n-k),\ \tfrac12[x(n-k+1) + x(n-m_k-k+1)],\ \ldots,\ \tfrac12[x(n) + x(n-m_1)]\} < \phi,$

where a term of the form $\tfrac12[x(n-h+1) + x(n-m_h-h+1)]$, $(h = 1, \ldots, k)$,
is to be deleted if² $n - m_h - h + 1 = 0$, has the confidence coefficient

(9) $\dfrac{1}{2^n}\Bigl[1 + m_1 + \sum_{i_1=1}^{m_2}(m_1 - i_1) + \sum_{i_2=1}^{m_3}\sum_{i_1=1}^{m_2-i_2}(m_1 - i_1 - i_2) + \cdots + \sum_{i_{k-1}=1}^{m_k}\ \sum_{i_{k-2}=1}^{m_{k-1}-i_{k-1}}\cdots\sum_{i_1=1}^{m_2-i_2-\cdots-i_{k-1}}(m_1 - i_1 - \cdots - i_{k-1})\Bigr].$


SKETCH OF PROOF. It is sufficient to consider the case in which the $n$ observations
are a sample from the uniform population with range $-\tfrac12$ to $\tfrac12$ (then $\phi = 0$).

Let us consider the region of the $n$-dimensional order statistic space defined by
(8). This region can be considered as an intersection of $n$-dimensional regions
each of which is completely defined by a certain region in an $x_i, x_j$ plane
$(1 \le i \le j \le n)$. Also the $n$-dimensional "volume" of this region equals the
value of the confidence coefficient of the confidence interval (8).

By Theorem 3, the intersection region of (8) consists of a certain number of
"basic" cells, each of $n$-dimensional "volume" $(\tfrac12)^n$. Theorem 4 is proved by
developing a method for finding the number of "basic" cells in this intersection
region on the basis of the corresponding regions in the $x_i, x_j$ planes. It is found
that the intersection region consists of

$$1 + m_1 + \sum_{i_1=1}^{m_2}(m_1 - i_1) + \cdots + \sum_{i_{k-1}=1}^{m_k}\cdots\sum_{i_1=1}^{m_2-i_2-\cdots-i_{k-1}}(m_1 - i_1 - \cdots - i_{k-1})$$

"basic" cells. A detailed derivation of this expression is given in [5].

Now consider some examples of the application of Theorem 4. Let $n = 11$,
$m_1 = 11$, $m_2 = 5$, $m_3 = 2$. Then, by Theorem 4, the one-sided confidence interval

$$\max[x_8,\ \tfrac12(x_9 + x_7),\ \tfrac12(x_{10} + x_5)] < \phi$$

has a confidence coefficient equal to $103/2^{11}$. If $n = 12$ instead of 11, the
confidence coefficient would be $103/2^{12}$ while the confidence interval becomes

$$\max[x_9,\ \tfrac12(x_{10} + x_8),\ \tfrac12(x_{11} + x_6),\ \tfrac12(x_{12} + x_1)] < \phi.$$

As another example, let $n = 11$ and consider the confidence interval

$$\max[x_8,\ \tfrac12(x_9 + x_7),\ \tfrac12(x_{10} + x_5),\ \tfrac12(x_{11} + x_4)] < \phi.$$

Here $k = 3$ and comparison with (8) shows that this confidence interval satisfies
Theorem 4 with $m_1 = 7$, $m_2 = 5$, $m_3 = 2$. Thus it has a confidence coefficient
equal to $51/2^{11}$.

2 For the trivial case in which $k = n$ the value of (9) is unity.
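The arithmetic behind these two examples can be checked mechanically. The following is a minimal sketch, assuming the nested-sum reading of (9) given in Theorem 4; the function names `basic_cell_count` and `confidence_coefficient` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def basic_cell_count(m):
    """Bracketed sum in (9) for integers m = (m_1, ..., m_k), m_1 > ... > m_k > 0."""

    def nested(level, used):
        # 'level' summation signs remain to be opened; 'used' is the total of
        # the outer summation indices that have already been fixed.
        if level == 0:
            return m[0] - used
        return sum(nested(level - 1, used + i)
                   for i in range(1, m[level] - used + 1))

    # 1, then m_1 (no summation), then the single sum, double sum, and so on.
    return 1 + sum(nested(j, 0) for j in range(len(m)))


def confidence_coefficient(n, m):
    """Value of (9) as an exact fraction r / 2^n."""
    return Fraction(basic_cell_count(m), 2 ** n)


print(basic_cell_count((11, 5, 2)))            # first example of the text
print(basic_cell_count((7, 5, 2)))             # second example of the text
print(confidence_coefficient(11, (11, 5, 2)))
```

For the selections used in the text this reproduces the numerators 103 and 51, and for the trivial case $k = n$ it returns unity, in agreement with footnote 2.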

Theorem 3 shows that each one-sided confidence interval developed on the
basis of Theorem 2 has a confidence coefficient of the form $r/2^n$, $(0 \le r \le 2^n)$.
The question arises as to whether the one-sided confidence intervals defined by
Theorem 4 have confidence coefficients which attain each of the values $1/2^n$,
$2/2^n, \ldots, (2^n - 1)/2^n$. That this is not the case is proved as follows: The
totality of different confidence intervals of the form (8) is equal to $2^n - 1$. This
is shown by counting how many ways the integers $m_1, \ldots, m_k$ can be selected
subject to the conditions $n \ge m_1 > m_2 > \cdots > m_k > 0$. It is easily seen that
there are $\binom{n}{k}$ possible ways. Summing over the possible values of $k$ yields
$2^n - 1$. This figure is increased to $2^n$ if the confidence interval $x_n < \phi$ is also
included. Examination of (9) shows, however, that two different selections
of $m_1$, $m_2$, etc., will result in the same value of (9) for more than one case.
Thus the one-sided confidence intervals of Theorem 4 do not have confidence
coefficients which attain each of the values $1/2^n, \ldots, (2^n - 1)/2^n$.

Although the class of one-sided confidence intervals defined by Theorem 4 do
not have confidence coefficients which attain each of the values $1/2^n$, $2/2^n, \ldots,$
$(2^n - 1)/2^n$, they do have another property which is important from a practical
point of view: If a certain confidence coefficient can be obtained for a particular
value of $n$, then this confidence coefficient can also be obtained for all greater
values of $n$. This result is a consequence of the following theorem:

THEOREM 5. Let $x(1), \ldots, x(n)$ be the ordered values of $n$ independent
observations drawn from populations satisfying conditions (A). Then if a confidence
interval of the form (8) has the confidence coefficient $\epsilon$ for a certain value $n_0$ of $n$, it is
always possible to obtain another confidence interval of the form (8) which has the
confidence coefficient $\epsilon$ for the value $n_0 + 1$.

PROOF. Let $m_1, \ldots, m_k$ be the integers corresponding to the given confidence
interval of form (8). These integers satisfy the condition

$$n_0 \ge m_1 > m_2 > \cdots > m_k > 0.$$

Let $n_0$ be replaced by $n_0 + 1$ and consider the new set of integers $(m_1 + 1)$,
$(m_2 + 1), \ldots, (m_k + 1), 1$. Evidently

$$n_0 + 1 \ge m_1 + 1 > m_2 + 1 > \cdots > m_k + 1 > 1 > 0.$$







Hence these integers can be used to define a confidence interval of the form (8).
Also it is easily verified that

$$\frac{1}{2^{n_0+1}}\Bigl[1 + (m_1 + 1) + \sum_{i_1=1}^{m_2+1}(m_1 + 1 - i_1) + \cdots + \sum_{i_k=1}^{1}\ \sum_{i_{k-1}=1}^{m_k+1-i_k}\cdots\sum_{i_1=1}^{m_2+1-i_2-\cdots-i_k}(m_1 + 1 - i_1 - \cdots - i_k)\Bigr]$$
$$= \frac{1}{2^{n_0}}\Bigl[1 + m_1 + \sum_{i_1=1}^{m_2}(m_1 - i_1) + \cdots + \sum_{i_{k-1}=1}^{m_k}\cdots\sum_{i_1=1}^{m_2-i_2-\cdots-i_{k-1}}(m_1 - i_1 - \cdots - i_{k-1})\Bigr].$$

Thus the new confidence interval has the same confidence coefficient as the given
confidence interval.

From symmetry considerations, the one-sided confidence interval

$$\min\{x(k+1),\ \tfrac12[x(k) + x(m_k + k)],\ \ldots,\ \tfrac12[x(1) + x(m_1 + 1)]\} > \phi,$$

where a term of the form $\tfrac12[x(h) + x(m_h + h)]$, $(h = 1, \ldots, k)$, is to be deleted
if $m_h + h = n + 1$, has the same confidence coefficient as the one-sided
confidence interval (8); i.e. its confidence coefficient is given by (9).
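Both counting statements of this section lend themselves to a brute-force check for small $n$. The sketch below is illustrative only: it assumes the nested-sum reading of (9), and the helper names `cells` and `check_theorem_5` are hypothetical. It enumerates every admissible selection $m_1 > \cdots > m_k$ for a given $n_0$, confirms that appending the integer 1 after raising each $m_h$ by one doubles the bracketed sum (so that the coefficient is unchanged when $n_0$ becomes $n_0 + 1$), and counts how many distinct numerators occur.

```python
from itertools import combinations


def cells(m):
    """Bracketed sum in (9) for a decreasing tuple m = (m_1, ..., m_k)."""

    def nested(level, used):
        if level == 0:
            return m[0] - used
        return sum(nested(level - 1, used + i)
                   for i in range(1, m[level] - used + 1))

    return 1 + sum(nested(j, 0) for j in range(len(m)))


def check_theorem_5(n0):
    # combinations() of a decreasing range yields decreasing tuples.
    selections = [m for k in range(1, n0 + 1)
                  for m in combinations(range(n0, 0, -1), k)]
    doubled = all(cells(tuple(x + 1 for x in m) + (1,)) == 2 * cells(m)
                  for m in selections)
    distinct = {cells(m) for m in selections}
    return len(selections), len(distinct), doubled


total, distinct, doubled = check_theorem_5(6)
print(total, distinct, doubled)
```

The first number returned is $2^{n_0} - 1$; the second is smaller because different selections can share a numerator (for instance $m_1 = 3$ and $m_1 = 2, m_2 = 1$ both give 4).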


7. Efficiency of some tests based on conditions (A). Let us consider the case
in which the $n$ observations used for a test are a sample from a normal population
with unknown variance. The purpose of this section is to investigate the
efficiency of some tests based on conditions (A) for this special case.

The method used to obtain efficiencies is outlined in section 3. Only one-sided
and symmetrical tests are considered. For this purpose it is sufficient to limit
investigations to one-sided tests of $\phi < \phi_0$.

If the subset of $\tfrac12(x_i + x_j)$, $(1 \le i \le j \le n)$, chosen for a test is not of one of
the forms

(a) $x_i$;
(b) $\tfrac12(x_i + x_j)$, $(i < j)$;
(c) $x_j,\ \tfrac12(x_i + x_k)$, $(i < j < k)$,

the determination of power function values requires a numerical double or higher
order integration. Such numerical integrations are extremely lengthy. For
this reason only one-sided significance tests based on subsets of the forms (a)-(c)
will be investigated.
Let the normal population have variance $\sigma^2$ and consider one-sided tests of
$\phi < \phi_0$ based on subsets of the form (a). Then

$$\text{Power Function} = \Pr(x_i < \phi_0) = \Pr\Bigl(\frac{x_i - \phi}{\sigma} < \frac{\phi_0 - \phi}{\sigma}\Bigr) = \sum_{j=i}^{n}\binom{n}{j}[N(\delta)]^j[1 - N(\delta)]^{n-j},$$

where

$$\delta = (\phi_0 - \phi)/\sigma, \qquad N(\delta) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\delta} e^{-y^2/2}\,dy.$$

The power function values listed for the test $x_i < \phi_0$ in Table 2 were computed
from the above expression. The corresponding values for the t-test were computed
from the normal approximation given in [2].
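As an illustration of how such power values can be obtained for a test of form (a), the sketch below uses the standard fact that the $i$th order statistic of $n$ independent observations falls below a point exactly when at least $i$ of the observations do. The function names are hypothetical, and the binomial-sum form is an assumption about how the computation may be organized rather than a statement of how the tabled values were produced.

```python
from math import comb, erf, sqrt


def normal_cdf(d):
    """N(d): standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(d / sqrt(2.0)))


def power_order_statistic_test(n, i, delta):
    """Pr(x_i < phi_0) for a normal sample of size n.

    delta = (phi_0 - phi) / sigma; delta = 0 gives the significance level.
    """
    p = normal_cdf(delta)
    return sum(comb(n, j) * p ** j * (1.0 - p) ** (n - j)
               for j in range(i, n + 1))


# Significance level and a few power values for the test x_5 < phi_0, n = 5.
for delta in (0.0, 0.5, 1.0, 2.0):
    print(delta, round(power_order_statistic_test(5, 5, delta), 4))
```

At $\delta = 0$ the expression reduces to a tail sum of binomial coefficients divided by $2^n$, which is of the form $r/2^n$ required by Theorem 3.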

For subsets of forms (b) and (c) the expression for the power function is more
complicated and will be neither derived nor stated here. For any particular
case, however, a simple analysis will yield an expression for the power function
which requires only a first order numerical integration. General expressions
for the power functions when the subsets are of the forms (b) and (c) are stated
and derived in [5].

Table 2 contains power function values and efficiencies for several tests based
on subsets of the forms (b) and (c). The power function values were computed
by approximate integration (Simpson's rule, etc.). The t-test power function
values were obtained by using the normal approximation. The power efficiencies
listed in Table 1 for tests which do not appear in Table 2 were computed in
[5], where a table of power function values is also given.

Examination of Table 2 shows that many of the tests formed from subsets of
types (b) and (c) are very efficient for small values of $n$. The efficiency appears
to decrease as $n$ increases. Also the efficiency of a test depends strongly on the
subset of $\tfrac12(x_i + x_j)$, $(1 \le i \le j \le n)$, used to form the test. For example,
let $n = 10$. The test

Accept $\phi < \phi_0$ if $\max[x_9,\ \tfrac12(x_1 + x_{10})] < \phi_0$

has a significance level of approximately .01 but an efficiency of only 82%.
However the test

Accept $\phi < \phi_0$ if $\max[x_8,\ \tfrac12(x_6 + x_{10})] < \phi_0$

also has a significance level of approximately .01 but an efficiency of 96.5%.

An approximate set of rules for picking subsets which result in efficient tests
of $\phi < \phi_0$ is suggested by the results of Table 2. Let $x(i_1), \ldots, x(i_r)$ be the
order statistics which make up the elements of the particular subset of $\tfrac12(x_i + x_j)$,
$(1 \le i \le j \le n)$, to be used for the test. The approximate rules are

1. Use the maximum of the values of the elements of the subset.

2. Choose $i_1, \ldots, i_r$ so that $\max(i_1, \ldots, i_r) = n$ and $\min(i_1, \ldots, i_r)$ is as
large as possible subject to the restriction that the test is to have a
significance level of a specified order of magnitude.

Symmetry considerations furnish the corresponding set of rules for obtaining
efficient tests of $\phi > \phi_0$.










Other tests at approximately the same significance levels but not based on sub- 
sets of the forms (a)-(c) are undoubtedly more efficient than many of the tests 
considered in Tables 1 and 2 (particularly for the larger values of n). Computa- 
tional difficulties, however, prevent consideration of more general situations. 


8. A general solution.³ A general method of obtaining one-sided tests of
$\phi < \phi_0$ and $\phi > \phi_0$, also symmetrical tests of $\phi \ne \phi_0$, on the basis of conditions
(A) is the following:

Let $y_1, \ldots, y_n$ be $n$ independent observations drawn from populations
satisfying conditions (A). Let

$$z_i = y_i - \phi_0 \quad (i = 1, \ldots, n).$$

If the null hypothesis of $\phi = \phi_0$ is satisfied, each $z_i$ is an observation from a
population satisfying conditions (A) with zero median. Consider the $2^n$ sets of values
obtained by the transformations

$$z_i \to \epsilon(i)\, z_i \quad (i = 1, \ldots, n),$$

where $\epsilon(i)$ is one of the signs + or -. Form the mean of each of the $2^n$ sets of
values. Then it is readily seen, from conditions (A), that the probability that
$\bar z\ (= \sum z_i/n)$ is less than the $(r + 1)$th largest of the $2^n$ means has the value
$r/2^n$ when the null hypothesis is true. Similarly the probability that $\bar z$ is greater
than the $(2^n - r)$th largest of the $2^n$ means is equal to $r/2^n$ if the null hypothesis
of $\phi = \phi_0$ is satisfied. Thus the test

Accept $\phi < \phi_0$ if $\bar z$ is less than the $(r + 1)$th largest of the $2^n$ means

is a one-sided test of $\phi < \phi_0$ with significance level equal to $r/2^n$. Likewise the
one-sided test

Accept $\phi > \phi_0$ if $\bar z$ is greater than the $(2^n - r)$th largest of the $2^n$ means

has the significance level $r/2^n$. Consequently the symmetrical test

Accept $\phi \ne \phi_0$ if $\bar z$ is either less than the $(r + 1)$th largest or greater
than the $(2^n - r)$th largest of the $2^n$ means

has a significance level equal to $2r/2^n$.

The application of any of the above tests requires the computation of the $2^n$
means and a determination of where $\bar z$ falls in the ordering of these means. If
$n = 5$, only 32 means need be computed. If $n = 10$, however, 1024 means must
be computed. Evidently this test is too cumbersome to apply except for very
small values of $n$.
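The procedure of this section can be sketched in a few lines for small $n$. The code below is illustrative: the function names are hypothetical, and it assumes that the "$(r + 1)$th largest" mean is counted in order of increasing magnitude, which is the reading under which the stated probability $r/2^n$ holds.

```python
from itertools import product


def sign_flip_means(z):
    """Means of the 2^n sets obtained by attaching a sign + or - to each z_i,
    sorted in increasing order."""
    n = len(z)
    return sorted(sum(e * v for e, v in zip(signs, z)) / n
                  for signs in product((1, -1), repeat=n))


def accept_phi_less_than(y, phi0, r):
    """One-sided test of phi < phi0 at significance level r / 2^n."""
    z = [v - phi0 for v in y]
    means = sign_flip_means(z)
    z_bar = sum(z) / len(z)
    # means[r] is the (r + 1)th mean counted from the smallest (assumption).
    return z_bar < means[r]


sample = [-1.2, -0.8, -2.0, -0.5, -1.1]        # made-up observations
print(len(sign_flip_means(sample)))            # 2^5 means
print(accept_phi_less_than(sample, 0.0, 1))    # level 1/32
```

The cost is $2^n$ means per test, which is the computational burden noted in the text.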


3 This solution was derived independently by E. J. G. Pitman and the author. The
fundamental idea on which the solution is based was presented by R. A. Fisher in [7].


9. Acknowledgements. The author would like to express his appreciation to
Professors S. S. Wilks and John W. Tukey for valuable advice and assistance in
the preparation of this paper, also to Mrs. Ruth S. Shafer for computational
assistance.


REFERENCES 


[1] John E. Walsh, "Applications of some significance tests for the median which are valid
under very general conditions," submitted to Am. Stat. Assn. Jour.

[2] N. L. Johnson and B. L. Welch, "Applications of the non-central t-distribution,"
Biometrika, Vol. 31 (1940), p. 376.

[3] John E. Walsh, "On the power function of the sign test for slippage of means," Annals
of Math. Stat., Vol. 17 (1946), pp. 360-361.

[4] H. Scheffé and J. W. Tukey, "Non-parametric estimation. I. Validation of order
statistics," Annals of Math. Stat., Vol. 16 (1945), pp. 187-192.

[5] John E. Walsh, "Some significance tests for the median which are valid under very
general conditions," unpublished thesis, Princeton University.

[6] G. Udny Yule and M. G. Kendall, An Introduction to the Theory of Statistics, Griffin
and Co., 1947.

[7] R. A. Fisher, The Design of Experiments, Oliver and Boyd, 1942.








A DIRECT METHOD FOR PRODUCING RANDOM DIGITS IN ANY 
NUMBER SYSTEM 


By H. Burke Horton and R. Tynes Smith III


Interstate Commerce Commission 


1. Summary. A compounding technique first used to produce random binary 
digits is generalized and extended to other number systems. Formulae for the 
rate of convergence of probabilities to the desired values are derived. The 
method is extended to the production of random digits with fixed but unequal 
probabilities. Numerical results are presented in summary form together with 
results of tests applied to a set of random digits produced by the method. 


2. Introduction. In a note [1] by one of the authors a method of producing 
random digits was presented. The method was based upon a process, designated 
“compound randomization,” used to produce random binary digits, which can be 
converted to random digits in other number systems by simple methods. De- 
spite the ease of converting a random binary series to another system, it is of 
interest to examine the problem of direct production of random digits in any 
number system. In the course of producing random binary digits with machine 
tabulating equipment, and while designing an electronic device to produce ran- 
dom binary digits, it was noted that the multiplication process described in the 
earlier paper was the equivalent of addition modulo 2 of a series of binary digits. 
This observation laid the basis for generalizing to other number systems.¹


3. Initial conditions and notation. Let us assume that there is available a
source of digits, $0, 1, 2, \ldots, (n - 1)$, in a number system of base $n$, where $n$ is a
positive integer, $n > 1$. Let $p_{rs}$ represent the probability of obtaining the $r$th
digit in the $s$th trial. Assume that initial conditions can be controlled so that
the trials are independent² and

(3.1) $p_{rs} \ge \epsilon,$

where $0 < \epsilon \le 1/n$ is a fixed positive number. (It may be noted at this point
that conventional "single-stage" methods of producing random numbers are
based upon the assumption that $p_{rs} = \epsilon = 1/n$.) Let $\pi_{rs}$ represent the
probability of obtaining the $r$th digit by addition modulo $n$ of the digits obtained in
$s$ individual trials. In order to express $\pi_{rs}$ in terms of $p_{rs}$, consider two sets of
matrices whose elements are defined as follows:


1 In acting as referee for [1] Dr. George W. Brown suggested generalizing to other number 
systems by addition modulo n. 

2 J. E. Walsh [2] has considered, in terms of conditional probabilities, the effect of inter- 
correlation on compound randomization in the binary system. 




(3.2) $a_s = \begin{Vmatrix} p_{0,s} & p_{n-1,s} & p_{n-2,s} & \cdots & p_{1,s} \\ p_{1,s} & p_{0,s} & p_{n-1,s} & \cdots & p_{2,s} \\ p_{2,s} & p_{1,s} & p_{0,s} & \cdots & p_{3,s} \\ \vdots & \vdots & \vdots & & \vdots \\ p_{n-1,s} & p_{n-2,s} & p_{n-3,s} & \cdots & p_{0,s} \end{Vmatrix}$

(3.3) $A_s = \begin{Vmatrix} \pi_{0,s} & \pi_{n-1,s} & \pi_{n-2,s} & \cdots & \pi_{1,s} \\ \pi_{1,s} & \pi_{0,s} & \pi_{n-1,s} & \cdots & \pi_{2,s} \\ \pi_{2,s} & \pi_{1,s} & \pi_{0,s} & \cdots & \pi_{3,s} \\ \vdots & \vdots & \vdots & & \vdots \\ \pi_{n-1,s} & \pi_{n-2,s} & \pi_{n-3,s} & \cdots & \pi_{0,s} \end{Vmatrix}$

Note that $a_s$ and $A_s$ are Markoff matrices with two additional restrictions: (1)
there are no zero elements, and (2) column (as well as row) sums are unity. Each
$n \times n$ matrix is made up of only $n$ distinct elements, namely, the $n$ different
probabilities associated with the $s$th trial for $a_s$, or the $n$ different probabilities
associated with the sum of $s$ trials for $A_s$.


4. Relation of $\pi_{rs}$ to $p_{rs}$. Assuming independent trials, we have the following
relationships:

(4.1)
$$A_1 = a_1;$$
$$A_2 = a_2 \cdot A_1 = a_2 \cdot a_1;$$
$$A_3 = a_3 \cdot A_2 = a_3 \cdot a_2 \cdot a_1;$$
$$\cdots$$
$$A_k = a_k \cdot A_{k-1} = \prod_{s=1}^{k} a_s.$$

Thus, since any row (or any column) of $A_k$ is a permutation of the $\pi_{rk}$, by (4.1)
the $\pi_{rk}$ are expressed in terms of the individual probabilities, $p_{rs}$.
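Because every matrix in (4.1) is determined by its first column, the product can be carried out on probability vectors alone. The sketch below is one way to do this, under the assumption that the element in row $r$ and column $c$ of the trial matrix is $p_{(r-c) \bmod n,\,s}$, as the layout of (3.2) indicates; the function names are illustrative.

```python
def compound(p, pi):
    """Distribution of (new digit + running sum) mod n.

    p  : probabilities of the digits 0..n-1 in the new trial
    pi : probabilities of the running sum modulo n before the new trial
    Equivalent to multiplying the matrix (3.2) into the first column of (3.3).
    """
    n = len(p)
    return [sum(p[(r - c) % n] * pi[c] for c in range(n)) for r in range(n)]


def sum_distribution(trials):
    """Probabilities of the digits obtained by adding all trials modulo n."""
    pi = list(trials[0])
    for p in trials[1:]:
        pi = compound(p, pi)
    return pi


# Binary example: ten trials, each with probabilities (.8, .2).
pi = sum_distribution([[0.8, 0.2]] * 10)
print(pi, max(pi) - min(pi))
```

For the binary case the difference between the two probabilities after $k$ trials is $|2p - 1|^k$, so the range printed here should agree with the first line of Table 1.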


5. Convergence of $\pi_{rs}$ to $1/n$. (5.01) THEOREM.³ $\lim_{s\to\infty} \pi_{rs} = 1/n$.

PROOF. Let $\rho_s$ denote the range of the elements of $A_s$. Each element of
$A_s$ is a weighted mean of the $n$ distinct elements of $A_{s-1}$. The distinct elements
of $a_s$ are used as weights in the averaging process. Now the range of a set of
weighted means (weights > 0) of a set of values must be less than the range of
the values themselves, unless both ranges are zero. Therefore, since the weights
$p_{rs} > 0$ by condition (3.1),

(5.02) $\rho_s < \rho_{s-1}$ for $\rho_{s-1} \ne 0$, or in the special case $\rho_{s-1} = 0$, $\rho_s = 0$.

Also, since $\sum_{r=0}^{n-1} \pi_{rs} = 1$,

(5.03) $1/n - \rho_s \le \pi_{rs} \le 1/n + \rho_s.$

3 While this article was awaiting publication, J. Wolfowitz independently proved theorem
(5.01).


In order to show that $\lim_{s\to\infty} \rho_s = 0$, and to derive formulae for the rate of
convergence of $\pi_{rs}$ to the limiting value, $1/n$, let $w_i$ represent the ordered $p_{rs}$ for any
given $s$: $w_1$ = the smallest $p_{rs}$, $\ldots$, $w_n$ = the largest of the $p_{rs}$. In a similar
manner let $x_i$ represent the ordered $\pi_{r,s-1}$. The following inequalities for the
maximum and minimum $\pi_{rs}$ can be set down immediately:

(5.04) $\max_r \pi_{rs} \le w_n x_n + w_{n-1} x_{n-1} + \cdots + w_1 x_1;$

(5.05) $\min_r \pi_{rs} \ge w_n x_1 + w_{n-1} x_2 + \cdots + w_1 x_n.$

And since $\rho_s = \max_r \pi_{rs} - \min_r \pi_{rs}$,

(5.06) $\rho_s \le w_n(x_n - x_1) + w_{n-1}(x_{n-1} - x_2) + \cdots + w_2(x_2 - x_{n-1}) + w_1(x_1 - x_n).$

For $n$ even, let $m = n/2 + 1$; then by regrouping terms,

(5.07) $\rho_s \le (w_n - w_1)(x_n - x_1) + (w_{n-1} - w_2)(x_{n-1} - x_2) + \cdots + (w_m - w_{m-1})(x_m - x_{m-1}).$

Noting that $\rho_{s-1} = (x_n - x_1) \ge (x_{n-1} - x_2) \ge \cdots \ge (x_m - x_{m-1})$, the following
substitutions can be made:

(5.08) $\rho_s \le (w_n - w_1)\rho_{s-1} + (w_{n-1} - w_2)\rho_{s-1} + \cdots + (w_m - w_{m-1})\rho_{s-1}.$

For compactness, this may be written,

(5.09) $\rho_s \le \Bigl[\sum_{i=m}^{n} w_i - \sum_{i=1}^{m-1} w_i\Bigr] \cdot \rho_{s-1}.$

Similarly for $n$ odd, let $m = (n + 1)/2$; proceeding in the same manner as above,
the median term vanishes, yielding as a final result,

(5.10) $\rho_s \le \Bigl[\sum_{i=m+1}^{n} w_i - \sum_{i=1}^{m-1} w_i\Bigr] \cdot \rho_{s-1}.$

For simplicity denote the expression in brackets by $\delta_s$; then

(5.11) $\rho_s \le \delta_s \cdot \rho_{s-1},$

where for $n$ even, $\delta_s$ represents the sum of the largest $n/2$ of the $p_{rs}$ minus the
sum of the smallest $n/2$ of the $p_{rs}$, and for $n$ odd $\delta_s$ represents the sum of the
largest $(n - 1)/2$ of the $p_{rs}$ minus the sum of the smallest $(n - 1)/2$ of the $p_{rs}$.
Continuing the process developed above, we find that

(5.12) $\rho_s \le \delta_s \cdot \delta_{s-1} \cdot \rho_{s-2},$

(5.13) $\rho_s \le \delta_s \cdot \delta_{s-1} \cdots \delta_2 \cdot \rho_1.$

Since $\delta_1 \ge \rho_1$, the following simple inequality holds:

(5.14) $\rho_k \le \prod_{s=1}^{k} \delta_s.$

Now $\delta_s \le 1 - n\epsilon$, by condition (3.1) and the definition of $\delta_s$. Therefore,

(5.15) $\lim_{k\to\infty} \rho_k \le \lim_{k\to\infty} \prod_{s=1}^{k} \delta_s \le \lim_{k\to\infty} (1 - n\epsilon)^k = 0,$

and (5.01) is proven. In the special case of constant probabilities from trial to
trial, $\delta_s = \delta_0$, a constant, and (5.14) becomes

(5.16) $\rho_k \le (\delta_0)^k.$

Since the mean $\pi_{rk}$ is $1/n$, we have the following useful inequalities:

(5.17) $1/n - \prod_{s=1}^{k} \delta_s \le \pi_{rk} \le 1/n + \prod_{s=1}^{k} \delta_s,$

in the case of varying probabilities, and

(5.18) $1/n - (\delta_0)^k \le \pi_{rk} \le 1/n + (\delta_0)^k,$

in the case of constant probabilities. If $\delta_s$ is not known in each trial, an upper
bound, $\bar\delta$, may be estimated on the basis of knowledge (including statistical
tests) of the digit generating process. Then the following inequality will hold:

(5.19) $1/n - (\bar\delta)^k \le \pi_{rk} \le 1/n + (\bar\delta)^k,$

where $\bar\delta \le (1 - n\epsilon)$.

It is worthy of note that inequalities (5.14) and (5.15) become equalities if $n = 2$
(binary system), thus,

(5.14b) $\rho_k = \prod_{s=1}^{k} \delta_s = \prod_{s=1}^{k} |p_s - q_s| = \prod_{s=1}^{k} |2p_s - 1|;$

(5.15b) $\rho_k = (\delta_0)^k = |p - q|^k = |2p - 1|^k.$

These results were obtained by different methods in [1].
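The bound for constant probabilities can be compared with the exact range on a small case. The following sketch is illustrative: the function names are hypothetical, the quantity computed by `half_sum_difference` follows the verbal definition of $\delta_s$ given after (5.11), and the exact range is obtained by repeated cyclic compounding of the same probability vector.

```python
def half_sum_difference(p):
    """delta for one trial: sum of the largest half of the probabilities minus
    the sum of the smallest half (the median term is dropped when n is odd)."""
    w = sorted(p)
    h = len(w) // 2              # n/2 for n even, (n - 1)/2 for n odd
    return sum(w[len(w) - h:]) - sum(w[:h])


def range_after(p, k):
    """Exact range of the digit probabilities after k compounded trials,
    all trials having the same probabilities p."""
    n = len(p)
    pi = list(p)
    for _ in range(k - 1):
        pi = [sum(p[(r - c) % n] * pi[c] for c in range(n)) for r in range(n)]
    return max(pi) - min(pi)


p = [0.5, 0.3, 0.2]              # base 3, second line of Table 1
bound = half_sum_difference(p) ** 10
print(range_after(p, 10), bound)
```

For $n = 2$ the two quantities coincide, as stated in the text; for larger bases the bound is usually conservative.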


6. Discussion of results. Certain facts are implicit in the foregoing analysis,
but are worthy of mention in passing. The compounding process may consist
of addition modulo $n$ of digits taken from a number of digit-producing machines.
If any machine, $h$, is perfect, i.e., $p_{rh} = 1/n$ for all $r$, each element of the
probability matrix $a_h$ will be equal to $1/n$, and $\rho_h = 0$. Consequently, each element of
$A_s$, $s \ge h$, will be equal to $1/n$ by (5.17) and the special case of (5.02). Thus
any combination which contains a perfect machine is perfect. This is equivalent
to a restatement of Von Mises' [3] requirement that the sum of a random set
and any other set must itself be a random set. Furthermore, by (5.02) the
results taken from any machine, no matter how nearly perfect, can be improved
by combining with the results of another machine, no matter how biased the
latter may be. In the limiting case, $p_{rs} = 1$ (or 0), the probabilities of the
various digits are merely interchanged.


7. Production of random numbers with fixed but unequal probabilities. The
principles presented above can be adapted to the production of random numbers
with unequal probabilities as follows: Assume that a set of random digits, $0, 1,
2, \ldots, (n - 1)$, is required in a number system of base $n$, with probabilities $q_0$,
$q_1, q_2, \ldots, q_{n-1}$; $\sum_{i=0}^{n-1} q_i = 1$, where each $q_i$ is a proper rational fraction which
may be written as the quotient of two positive integers, $q_i = u_i/v_i$. Choose $m$
as the basis of a new number system, where $m$ is the least common multiple of
the $v_i$,

$$q_i = \frac{u_i}{v_i} = \frac{m u_i / v_i}{m}.$$

A set of random digits, $0, 1, 2, \ldots, (m - 1)$, in a number system of base $m$ may
be generated by the process described above, or a set of such digits may be
constructed by entering an existing table of random digits, base $n$, and interpreting
appropriate numerical quantities, base $n$, as digit symbols, base $m$. Since $m u_i/v_i$
is an integer, groups of $m u_0/v_0,\ m u_1/v_1,\ \ldots,\ m u_{n-1}/v_{n-1}$ digits in the $m$ system may be
coded as digits, $0, 1, 2, \ldots, (n - 1)$, in the $n$ system. An upper bound for the
maximum bias of $q_i$ will be $(m u_i/v_i)\,\rho_k$, where $\rho_k$ is the range of $\pi_{rk}$ in the $m$ system.
Thus, by increasing $k$, the bias of $q_i$ can be made smaller than any preassigned
quantity.
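A short sketch of the coding step may help. It assumes the probabilities are supplied as exact fractions, builds the least common multiple $m$ of their denominators, and assigns consecutive blocks of base-$m$ digits to the required base-$n$ digits; the function name `coding_table` and the particular block assignment are illustrative choices, since any assignment with the right block sizes would serve.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def coding_table(q):
    """Return (m, table): table[d] is the base-n digit coded by base-m digit d.

    q is a list of Fractions q_0, ..., q_{n-1} summing to 1.
    """
    m = reduce(lambda a, b: a * b // gcd(a, b),
               (f.denominator for f in q), 1)
    table = []
    for digit, f in enumerate(q):
        # m * u_i / v_i base-m digits are assigned to the digit i.
        table.extend([digit] * (f.numerator * m // f.denominator))
    return m, table


q = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
m, table = coding_table(q)
print(m, table)
```

A random digit $d$ in base $m$ is then replaced by `table[d]`; if the base-$m$ digits are equally likely, digit $i$ appears with probability $q_i$.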


8. Convergence under more general conditions. Convergence of $\pi_{rs}$ to $1/n$
occurs under a variety of conditions less restrictive than (3.1).

(8.1) THEOREM. In the case of independent trials, a necessary and sufficient
condition that $\lim_{s\to\infty} \pi_{rs} = 1/n$ is that $\{\pi_{0t},\ \pi_{1t}\} \ge \epsilon$, where $\epsilon$ is a fixed positive number,
arbitrarily small, and $t$ is a fixed positive integer, arbitrarily large.

It is obvious that (8.1) is a necessary condition for convergence. To prove that it is a
sufficient condition, consider the following:

(8.2) LEMMA. If $\{p_{0s},\ p_{1s}\} \ge \eta$, where $\eta$ is a fixed positive number, arbitrarily small,
then $\lim_{s\to\infty} \pi_{rs} = 1/n$.

PROOF: Take a fixed integer, $h$, $h \ge n - 1$. Now any digit, $r$, can be obtained
in at least one way; i.e., as the sum of $r$ ones and $(h - r)$ zeros. Therefore,

(8.3) $\pi_{rh} \ge \tau$, where $\tau = \eta^h$.

We now regard $h$ trials as a single trial of a complex machine. Let $u$ represent
the number of such complex trials. Let $\pi'_{ru}$ represent the probability of
obtaining the $r$th digit as the result of addition modulo $n$ of $u$ complex trials. Then,

(8.4) $\lim_{u\to\infty} \pi'_{ru} = \lim_{u\to\infty} \pi_{r(uh)} = 1/n,$

by (5.01). Now $s = uh + j$, $0 \le j < h$, ($j$ an integer), or $uh \le uh + j <
(u + 1)h$. The $j$ simple trials cannot increase the maximum bias, by (5.02);
consequently,

(8.5) $\lim_{u\to\infty} \pi_{r(uh+j)} = \lim_{(uh+j)\to\infty} \pi_{r(uh+j)} = 1/n.$

Since there is a one-to-one correspondence between the elements of $\{s\}$ and
$\{uh + j\}$,

(8.6) $\lim_{s\to\infty} \pi_{rs} = 1/n.$

By a natural extension of the lemma, we may regard $t$ trials as a single
complex trial. Theorem (8.1) thus assumes the form of (8.2).


9. Numerical results in various number systems. More efficient convergence
formulae can be devised to meet special conditions. Those presented in (5)
have the advantages of simplicity and generality. To test the efficiency of
(5.15) several numerical examples, based upon unusual hypothetical probabilities,
were worked by matrix multiplication as in (4.1). In these problems $p_{rs} = p_r$,
a constant, from trial to trial. A tabular comparison of the ranges, computed
by (4.1), and the upper bounds, determined by (5.15), is presented in Table 1
for $k = 10$.


10. Preparation and tests of a set of random digits. Since an unlimited num- 
ber of valid tests for randomness may be devised, it is obvious that any finite 
set of digits cannot meet all such tests. As a matter of fact a truly random proc- 
ess should yield sets which fail to meet some proportion of the tests, the fraction 
being determined by the level of significance adopted in testing. No finite set 
of digits can be considered random; the tests for randomness are really applied 
to determine the character of the generating process. However, the concept of 
“locally random” sets as developed by Kendall and Smith [4] is useful, and some 
of their tests are used below as evidence that a set of numbers produced by com- 
pound randomization is likely to be locally random. 

A non-random set of 10,000 decimal digits having the relative frequencies 
indicated in the starred line of Table 1 was punched in cards and tabulated. 
Totals were taken for each ten cards and the amount in the unit’s position of the 
counter was cut in a summary card, thereby producing a set of 1,000 digits. 
The frequencies of digits in the derived set are compared with those of the gen- 
erating set in Table 2. The frequencies of the derived set are in accord with the 
hypothesis of equal probabilities. 
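The punched-card procedure just described amounts to summing blocks of ten digits and keeping the units digit. The sketch below imitates it with a pseudo-random source, which is an assumption made purely for illustration (the original digits came from tabulated cards), and recomputes the frequency-test statistic from the derived-set counts implied by Table 2 (relative frequencies times 1,000). The function names are hypothetical.

```python
import random


def compound_digits(source, chain=10, base=10):
    """Sum each block of 'chain' digits and keep the remainder modulo 'base'."""
    return [sum(source[i:i + chain]) % base
            for i in range(0, len(source) - chain + 1, chain)]


def chi_square_uniform(counts):
    """Chi-square statistic against equal expected frequencies."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Derived-set counts implied by Table 2.
table2_counts = [88, 112, 86, 105, 113, 102, 101, 98, 97, 98]
print(chi_square_uniform(table2_counts))

# Illustrative simulation with the starred probabilities of Table 1.
weights = [.014, .171, .164, .184, .023, .095, .047, .205, .089, .008]
rng = random.Random(0)                      # fixed seed, arbitrary choice
source = rng.choices(range(10), weights=weights, k=10_000)
derived = compound_digits(source)
counts = [derived.count(d) for d in range(10)]
print(counts, chi_square_uniform(counts))
```

The first printed value reproduces the statistic quoted under Table 2; the simulated counts will of course differ from the published ones.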





TABLE 1

Comparison of computed range and formula for maximum bias, k = 10
Hypothetical numerical examples, constant probabilities from trial to trial

Number                      Probability in an individual trial
base      p0    p1    p2    p3    p4    p5    p6    p7    p8    p9   p10   p11    $\rho_{10}$   $(\delta_0)^{10}$   $\delta_0$
   2    .800  .200     -     -     -     -     -     -     -     -     -     -   .0060466176   .0060466176   .600
   3    .500  .300  .200     -     -     -     -     -     -     -     -     -   .0000018357   .0000059049   .300
   3    .970  .020  .010     -     -     -     -     -     -     -     -     -   .6616765365   .6648326360   .960
   3    .400  .300  .300     -     -     -     -     -     -     -     -     -   .0000000001   .0000000001   .100
   4    .200  .100  .400  .300     -     -     -     -     -     -     -     -   .0000032768   .0001048576   .400
   5    .050  .200  .400  .020  .330     -     -     -     -     -     -     -   .0007878177   .0156833688   .660
   6    .080  .240  .360  .020  .200  .100     -     -     -     -     -     -   .0000168472   .0060466176   .600
   7    .300  .020  .240  .050  .130  .170  .090     -     -     -     -     -   .0001778804   .0025329516   .550
   8    .200  .050  .060  .180  .160  .090  .150  .110     -     -     -     -   .0000000965   .0000627821   .380
   9    .030  .080  .150  .060  .140  .090  .190  .050  .210     -     -     -   .0000052328   .0005259913   .470
  10    .050  .150  .200  .050  .050  .120  .080  .020  .180  .100     -     -   .0000132662   .0009765625   .500
  10    .010  .020  .030  .040  .050  .060  .070  .080  .090  .550     -     -   .0012522218   .0282475249   .700
  10    .110  .110  .110  .110  .110  .110  .110  .110  .110  .010     -     -   .0000000001   .0000000001   .100
  10    .150  .150  .150  .150  .150  .050  .050  .050  .050  .050     -     -   .0000009244   .0009765625   .500
  10*   .014  .171  .164  .184  .023  .095  .047  .205  .089  .008     -     -   .0000501840   .0111739516   .638
  12    .010  .070  .120  .160  .050  .020  .090  .040  .080  .110  .060  .190   .0000002256   .0009765625   .500

* This badly biased set of probabilities was used to produce the set of random decimal
digits tested in the next section.








TABLE 2

Digit               0     1     2     3     4     5     6     7     8     9
Generating set    .014  .171  .164  .184  .023  .095  .047  .205  .089  .008
Derived set       .088  .112  .086  .105  .113  .102  .101  .098  .097  .098

Frequency test (derived set): $\chi^2 = 7.0$, $P = .63$

TABLE 3

ith                     (i + 1)th digit
digit      0    1    2    3    4    5    6    7    8    9
  0       11    8    7    7    5    7   12   12   11    8
  1       10   13   15    9   11   14   11    8   10   11
  2       11   10    7   10   10    7    6    9    7    9
  3        9   10    3   14   12   17    9    8   11   12
  4        6   12   10   10   19    6   16   14   13    7
  5        9   17   11   14   10    6    5   15    6    9
  6        6   14    9    9   14   10   15    8    6   10
  7       13   10    9    9    8   11    7   12    7   12
  8        7    8    8   12    9   11   14    8   10   10
  9        6   10    7   11   15   13    6    4   16   10

$\chi^2$ = [?]6.28, $P = .90$


In the serial test adjacent pairs of digits are tabulated. The distribution of 
these pairs in the derived set appears in Table 3. This test indicates that ad- 
jacent digits are independent. 





TABLE 4

Gap test

                                Length of gap
Digit   Frequencies     0-1     2-4     5-7    8 and over    $\chi^2$     P
  0     Observed         16      18      11        42         1.25      .75
        Expected       16.53   19.10   13.92     37.45
  1     Observed         27      27      21        36         5.44      .15
        Expected       21.09   24.37   17.76     47.78
  2     Observed         16      17      10        42         1.90      .60
        Expected       16.15   18.66   13.60     36.59
  3     Observed         19      26      18        41          .90      .92
        Expected       19.76   22.83   16.64     44.77
  4     Observed         31      17      20        44         7.39      .06
        Expected       21.28   24.59   17.92     48.21
  5     Observed         15      21      15        50         2.04      .57
        Expected       19.19   22.17   16.16     43.48
  6     Observed         27      25      12        36         5.95      .12
        Expected       19.00   21.95   16.00     43.05
  7     Observed         20      19      16        42          .40      .93
        Expected       18.43   21.29   15.52     41.76
  8     Observed         14      19      21        42         3.27      .35
        Expected       18.24   21.07   15.36     41.32
  9     Observed         18      18      21        40         2.53      .48
        Expected       18.43   21.29   15.52     41.76


The gap test is based upon the distribution of lengths of intervals between
given digits. A comparison of the number of gaps of specified lengths and the
expected number in each case is presented in Table 4. The results of this test
are also in accord with the assumption of local randomness. Noting the badly
biased probabilities of the initial set of digits, the results of these tests
demonstrate the effectiveness of the compound randomization process.
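
A sketch of how the gap counts and their expectations might be computed is given below. The helper names are hypothetical, and the grouping of gap lengths into the classes 0-1, 2-4, 5-7 and 8 or more is an assumption; it is, however, consistent with the expected frequencies of Table 4 (for digit 0, 87 gaps times 0.19 gives 16.53).

```python
def gap_lengths(digits, target):
    """Numbers of non-target digits between successive occurrences of target."""
    positions = [i for i, d in enumerate(digits) if d == target]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def gap_class_probabilities(bounds=((0, 1), (2, 4), (5, 7)), p=0.1):
    """Probability of each gap-length class when digits are independent.

    A gap of length k has probability p * (1 - p)**k; the final entry
    collects every gap longer than the last bounded class.
    """
    probs = [sum(p * (1 - p) ** k for k in range(lo, hi + 1)) for lo, hi in bounds]
    probs.append(1.0 - sum(probs))
    return probs
```

Multiplying these probabilities by the number of gaps observed for a digit gives that digit's row of expected frequencies.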

The use of tabulating equipment for producing random decimal digits by
addition modulo 10 is relatively fast and simple. The authors have just completed
production of a set of 105,000 digits in less than two days' tabulating time.
75,000 cards, representing approximately 3 months' receipts of a current carload
waybill study, were used to generate the digits, 14 non-correlated columns being
added simultaneously. A chain of length 10 was used, although the nature of
the initial data was such that a shorter length would probably have given
satisfactory results. The derived set is now recorded on 1500 cards, 70 digits per
card. Preliminary tests for local randomness confirm the random nature of the
generating process. Upon completion of the tests this set will be reproduced in
tabular form.
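
Addition modulo 10 over a chain of digits can be imitated in a few lines. The sketch below is illustrative only: `compound_digits` is a hypothetical name, the chains are assumed to be non-overlapping, and the biased source digits are simulated from the generating probabilities of Table 2 rather than read from punched cards.

```python
import random
from collections import Counter


def compound_digits(source, chain_length=10, base=10):
    """Sum non-overlapping chains of source digits modulo `base`."""
    usable = len(source) - len(source) % chain_length
    return [
        sum(source[i:i + chain_length]) % base
        for i in range(0, usable, chain_length)
    ]


# Biased generating probabilities for digits 0..9 (Table 2, generating set).
GENERATING = [.014, .171, .164, .184, .023, .095, .047, .205, .089, .008]

rng = random.Random(0)  # fixed seed, chosen only for repeatability
source = rng.choices(range(10), weights=GENERATING, k=100_000)
derived = compound_digits(source)
frequencies = Counter(derived)  # counts of each derived digit
```

With a chain of length 10, the 100,000 simulated source digits yield 10,000 derived digits whose frequencies can then be submitted to the tests described above.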


REFERENCES 

[1] H. B. Horton, "A method for obtaining random numbers," Annals of Math. Stat., Vol.
19 (1948), pp. 81-85.

[2] J. E. Walsh, "Concerning compound randomization in the binary system,"
unpublished manuscript, Project RAND, Douglas Aircraft Co., Santa Monica, California.

[3] R. von Mises, Probability, Statistics and Truth, The Macmillan Co., New York, 1939.

[4] M. G. Kendall and B. B. Smith, "Randomness and random sampling numbers,"
Roy. Stat. Soc. Jour., Vol. 101 (1938), pp. 147-166.

[5] M. G. Kendall and B. B. Smith, "Second paper on random sampling numbers," Supp.
to Roy. Stat. Soc. Jour., Vol. 6 (1939), pp. 51-61.

[6] G. U. Yule, "A test of Tippett's random sampling numbers," Roy. Stat. Soc. Jour., Vol.
101 (1938), pp. 167-172.

[7] C. W. Vickery, "On drawing a random sample from a set of punched cards," Supp. to
Roy. Stat. Soc. Jour., Vol. 6 (1939), pp. 62-66.




ON A MATCHING PROBLEM ARISING IN GENETICS 


By Howard Levene


Columbia University 


1. Summary. A statistic useful for detecting deviations from the Hardy-Weinberg
equilibrium in population genetics is discussed. Both exact and
asymptotic distributions are given and a special case where there is
misclassification is discussed. The distribution obtained also arises from a certain card
matching problem.


2. Introduction. A system of multiple alleles behaves as follows under
Mendelian inheritance: There are $r$ distinct forms or alleles, $a_1, \dots, a_r$, of a
given gene. A given individual contains two genes and can be represented as
$a_i/a_j$. If $i = j$ the individual is called a homozygote; if $i \ne j$ it is called a
heterozygote. The representation $a_i/a_j$ is called the genotype. In reproduction
each gamete produced by an $a_i/a_j$ individual contains one gene which has a
probability 1/2 of being $a_i$ and 1/2 of being $a_j$. In fertilization a paternal and a
maternal gamete fuse to form a new individual which contains two genes, giving
the well-known Mendelian ratios. We now consider a large random breeding
population of $N$ individuals. This will contain $2N$ genes, of which the
proportion $q_i$ will be of type $a_i$ ($i = 1, \dots, r$; $\sum q_i = 1$). The probability that a
random individual from the next generation will be $a_i/a_j$ is $q_i^2$ ($i = j$) or $2q_iq_j$ ($i \ne j$),
which are known as the Hardy-Weinberg equilibrium probabilities. The
statistical problem arose in testing (by means of a sample of $n$ individuals) the
hypothesis that this Hardy-Weinberg ratio holds against the alternative
hypothesis that disturbing forces decrease the number of homozygotes. The actual
data has been discussed elsewhere [1].
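
The equilibrium probabilities can be tabulated directly from the gene proportions. The following is a small sketch; the function name `hardy_weinberg` and the zero-based allele indices are conventions chosen here for illustration.

```python
from itertools import combinations_with_replacement


def hardy_weinberg(q):
    """Map each genotype (i, j) with i <= j to its equilibrium probability.

    q[i] is the proportion of allele i; homozygotes get q_i**2 and
    heterozygotes get 2 * q_i * q_j.
    """
    return {
        (i, j): q[i] ** 2 if i == j else 2 * q[i] * q[j]
        for i, j in combinations_with_replacement(range(len(q)), 2)
    }
```

For three alleles with proportions 0.5, 0.3 and 0.2, the six genotype probabilities sum to one, and the expected proportion of homozygotes is the sum of the squared proportions.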


3. The sample distribution of number of homozygotes. We shall assume
throughout this paper that $N$ is so large that random fluctuations in the
population proportions from generation to generation can be ignored. Let
$x_{ij}$ ($i \le j = 1, \dots, r$) be the number of $a_i/a_j$ individuals in the sample, and let
$y_i = 2x_{ii} + \sum_{j \ne i} x_{ij}$ be the number of $a_i$ genes in the sample. We have $\sum x_{ij} = n$
and $\sum y_i = 2n$. Let $h = \sum x_{ii}$ be the number of homozygotes, and $z = n - h$
be the number of heterozygotes in the sample. The probability of the observed
sample is

$$\frac{n!}{\prod_{i \le j} x_{ij}!} \prod_{i=1}^{r} (q_i^2)^{x_{ii}} \prod_{i<j} (2 q_i q_j)^{x_{ij}} = \frac{n!\, 2^{z}}{\prod_{i \le j} x_{ij}!} \prod_{i=1}^{r} q_i^{y_i}. \tag{1}$$



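Since the $q_i$ are unknown we use the conditional probability when $y_1, \dots, y_r$
are held constant. Whenever we use the word "conditional" hereafter, this
condition will be understood. The conditional probability is

$$K' \frac{n!\, 2^{z}}{\prod_{i \le j} x_{ij}!}, \quad \text{where} \quad \frac{1}{K'} = {\sum}' \frac{n!\, 2^{z}}{\prod_{i \le j} x_{ij}!}, \tag{2}$$

and the summation $\sum'$ is over all non-negative integral values of the $x_{ij}$
subject to the condition

$$2x_{ii} + \sum_{j \ne i} x_{ij} = y_i \qquad (i = 1, \dots, r).$$

Consider

$$\Bigl(\sum_{i=1}^{r} t_i\Bigr)^{2n} = \Bigl(\sum_{i=1}^{r} t_i^2 + 2\sum_{i<j} t_i t_j\Bigr)^{n} \tag{3}$$

$$= {\sum}^{*} \frac{n!\, 2^{z}}{\prod_{i \le j} x_{ij}!} \prod_{i=1}^{r} t_i^{\,2x_{ii} + \sum_{j \ne i} x_{ij}}, \tag{4}$$

where the summation $\sum^{*}$ is over all non-negative values of the $x_{ij}$ subject
to the condition $\sum_{i \le j} x_{ij} = n$. Evidently $1/K'$ is the coefficient of $\prod t_i^{y_i}$ in (4);
but this must equal the coefficient of this term in the left member of (3); and
thus $1/K' = (2n)!/\prod y_i!$. Hence the conditional probability of the observed
sample is

$$\frac{n!\, 2^{z} \prod_{i} y_i!}{(2n)! \prod_{i \le j} x_{ij}!}. \tag{5}$$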


For any function $u(x_{11}, \dots, x_{1r}, \dots, x_{rr})$ we will now let $E(u)$ and $\sigma^2(u)$
denote the conditional mean and variance of $u$ for fixed $y_i$, and will refer to them
simply as the mean and variance. We first obtain the $s$th factorial moment of
$x_{ii}$, that is $E(x_{ii}^{(s)})$, where $x^{(s)} = x(x - 1) \cdots (x - s + 1)$. Consider

$${\sum}' x_{ii}^{(s)} \frac{2^{z}\, n!}{\prod x_{jk}!} = n^{(s)} {\sum}' \frac{2^{z}\,(n - s)!}{\prod x'_{jk}!}, \tag{6}$$

where $x'_{jk} = x_{jk}$ except that $x'_{ii} = x_{ii} - s$, and $\sum'$ has the same meaning as in (2).
The right member of (6) is evaluated exactly as before, giving

$$E(x_{ii}^{(s)}) = \frac{n^{(s)}\, y_i^{(2s)}}{(2n)^{(2s)}}. \tag{7}$$

From this expression we obtain

$$E(x_{ii}) = \frac{y_i(y_i - 1)}{2(2n - 1)} = n f_i^2 + O(1), \tag{8}$$




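and

$$\sigma^2(x_{ii}) = \frac{n^{(2)} y_i^{(4)}}{(2n)^{(4)}} + \frac{n\, y_i^{(2)}}{(2n)^{(2)}} - \left[\frac{n\, y_i^{(2)}}{(2n)^{(2)}}\right]^2 = n f_i^2 (1 - f_i)^2 + O(1), \tag{9}$$

where $f_i = y_i/2n$ is the sample estimate of $q_i$. Similarly

$$E(x_{ii}^{(s)} x_{jj}^{(t)}) = \frac{n^{(s+t)}\, y_i^{(2s)}\, y_j^{(2t)}}{(2n)^{(2s+2t)}}, \tag{10}$$

giving

$$\sigma(x_{ii}, x_{jj}) = \frac{n^{(2)} y_i^{(2)} y_j^{(2)}}{(2n)^{(4)}} - \frac{y_i^{(2)} y_j^{(2)}}{4(2n - 1)^2}. \tag{11}$$

Other moments can be similarly evaluated, in particular $E(x_{ij}) = y_i y_j/(2n - 1)$.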


4. Asymptotic distribution of number of homozygotes. From (8), (9), and
(11) we may easily obtain

$$E(h) = \sum E(x_{ii}) = (C - 2n)/(4n - 2), \tag{12}$$

$$\sigma^2(h) = \sum \sigma^2(x_{ii}) + 2 \sum_{i<j} \sigma(x_{ii}, x_{jj}) \tag{13}$$

$$= \frac{C}{8n^2}(2n + 4) + C^2\left(\frac{2n + 5}{32 n^4}\right) - D\left(\frac{n + 2}{4 n^3}\right) - \frac{1}{2} + O(1/n), \tag{14}$$

where $C = \sum y_i^2$ and $D = \sum y_i^3$. The formula (14) is a close approximation to
(13) and is easily computed. From (5) by means similar to those classically
used to prove asymptotic normality of the binomial distribution we can prove
asymptotic normality of the conditional distribution of $h$; more precisely, if
$n \to \infty$ and $y_i/n \to$ constant ($i = 1, \dots, r$), then

$$\mathrm{Prob}\left\{\frac{h - E(h)}{\sigma(h)} < t\right\} \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-x^2/2}\, dx. \tag{15}$$
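
The mean (12) and the exact variance (13) are easy to evaluate with rational arithmetic. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the variance is assembled from the factorial-moment expressions that underlie (9) and (11), namely $E(x_{ii}^{(2)}) = n^{(2)} y_i^{(4)}/(2n)^{(4)}$ and $E(x_{ii}) = n\,y_i^{(2)}/(2n)^{(2)}$, rather than from the approximation (14).

```python
from fractions import Fraction


def falling(x, s):
    """Falling factorial x(x-1)...(x-s+1)."""
    out = 1
    for k in range(s):
        out *= x - k
    return out


def mean_h(y):
    """E(h) from (12); y lists the gene counts y_i, which sum to 2n."""
    n = sum(y) // 2
    C = sum(v * v for v in y)
    return Fraction(C - 2 * n, 4 * n - 2)


def var_h(y):
    """Exact sigma^2(h) as in (13); requires n >= 2."""
    n = sum(y) // 2
    a = Fraction(falling(n, 2), falling(2 * n, 4))  # n^(2) / (2n)^(4)
    b = Fraction(n, falling(2 * n, 2))              # n / (2n)^(2)
    variances = sum(
        a * falling(v, 4) + b * falling(v, 2) - (b * falling(v, 2)) ** 2
        for v in y
    )
    covariances = sum(
        (a - b * b) * falling(y[i], 2) * falling(y[j], 2)
        for i in range(len(y))
        for j in range(i + 1, len(y))
    )
    return variances + 2 * covariances
```

For two alleles with $y = (2, 2)$ and $n = 2$, the only possible samples have $h = 2$ (probability 1/3) or $h = 0$ (probability 2/3), so the mean is 2/3 and the variance 8/9, which is what the functions return.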


5. Effect of misclassification. There is a further complication in the particular
case reported in [1]. All individuals of genotype $a_i/a_i$ are correctly classified,
but an individual of genotype $a_i/a_j$ ($i \ne j$) has a known probability $p/2$ of being
classified $a_i/a_i$ and an equal probability of being classified $a_j/a_j$. As a result,
the observed proportion of homozygotes is a biased estimate of the proportion in
the population. Let $h$, $x_{ij}$, $y_i$ denote the true sample values, and let $h'$, $x'_{ij}$, $y'_i$
denote the recorded sample values. Then $h^* = h' - e$, where $e = (n - h')p/(1 - p)$,
will give an unbiased estimate, i.e. $E(h^*) = E(h)$. In order to use $h^*$
we must have its (conditional) variance. Since $h^* = -np/(1 - p) + h'/(1 - p)$,

$$\sigma^2_{h^*} = [1/(1 - p)]^2\, \sigma^2_{h'}.$$

Let $h' - h = \epsilon$; then for large fixed $(n - h)$, $\epsilon$ is approximately normally
distributed with mean $(n - h)p$ and variance

$$(n - h)p(1 - p) = [n - E(h)]\,p(1 - p)\,[1 + O_p(1/\sqrt{n})].$$

Neglecting the remainder term in this variance, $\epsilon$ and $h$ have a joint normal
distribution with parameters that are easily calculated. We thus have

$$\sigma^2_{h'} = \sigma^2_h + \sigma^2_\epsilon + 2\sigma(h, \epsilon), \quad \text{or} \quad \sigma^2_{h'} = [n - E(h)]\,p(1 - p) + (1 - p)^2 \sigma^2_h,$$

giving

$$\sigma^2_{h^*} = \sigma^2_h + [n - E(h)]\,p/(1 - p). \tag{16}$$

In [1] $\sigma^2_{h^*}$ was given as $\sigma^2_h + e$ for the sake of simplicity. This would tend to be
smaller than (16), but only negligibly so. Strictly speaking the calculation of
$E(h)$ and $\sigma^2_h$ from (12) and (14) requires a knowledge of the true $y_i$, but the
observed $y'_i$ are unbiased estimates of the $y_i$ and their use should cause no
serious trouble.
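
The correction for misclassification amounts to two short formulas, sketched here with hypothetical function names. The first removes the expected number $e$ of misclassified heterozygotes from the recorded count; the second evaluates (16) from a mean and variance of $h$ supplied by the caller.

```python
def corrected_homozygotes(h_recorded, n, p):
    """Unbiased estimate h* = h' - (n - h') p / (1 - p)."""
    return h_recorded - (n - h_recorded) * p / (1 - p)


def corrected_variance(var_h, mean_h, n, p):
    """Variance of h* from (16): sigma_h^2 + [n - E(h)] p / (1 - p)."""
    return var_h + (n - mean_h) * p / (1 - p)
```

As a check of the unbiasedness, suppose a sample of 100 truly contains 20 homozygotes and $p = 0.25$. On average 20 of the 80 heterozygotes are recorded as homozygotes, giving $h' = 40$, and `corrected_homozygotes(40, 100, 0.25)` recovers 20.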


6. Combinatorial statement of the problem. This problem can also be
expressed as one of card matching as follows: A deck contains $2n$ cards of $r$
different suits, with $y_i$ cards of the $i$th suit ($i = 1, \dots, r$). We draw $n$ pairs of
cards at random without replacement, exhausting the deck. What is the
distribution of $h$, the number of twins (pairs in which both members are of the
same suit)? If $z = n - h$, the probability of exactly $h$ twins is given by (5), and
in the limit $h$ is normally distributed with mean given by (12) and variance
given by (14). The card matching problem does not involve the notion of
conditional probability. By introducing variables $w_\alpha$ equal to one if the $\alpha$th
pair is a twin and zero otherwise, the moments of $h$ can also be obtained
without using generating functions.
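
For a small deck the distribution of twins can be found by brute force, which gives an independent check on (12). The sketch below uses hypothetical function names and assumes that drawing pairs at random without replacement makes every division of the (distinguishable) cards into unordered pairs equally likely.

```python
from collections import Counter
from fractions import Fraction


def pairings(cards):
    """Yield every way of splitting the list `cards` into unordered pairs."""
    if not cards:
        yield []
        return
    first, rest = cards[0], cards[1:]
    for k, partner in enumerate(rest):
        for tail in pairings(rest[:k] + rest[k + 1:]):
            yield [(first, partner)] + tail


def twin_distribution(y):
    """Exact distribution of h for a deck with y[i] cards of suit i."""
    deck = [suit for suit, count in enumerate(y) for _ in range(count)]
    counts = Counter(sum(a == b for a, b in p) for p in pairings(deck))
    total = sum(counts.values())
    return {h: Fraction(c, total) for h, c in sorted(counts.items())}
```

The number of pairings grows as $(2n - 1)(2n - 3) \cdots 1$, so the enumeration is practical only for decks of a dozen or so cards.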


REFERENCE 


[1] Theodosius Dobzhansky and Howard Levene, "Genetics of natural populations.
XVII. Proof of operation of natural selection in wild populations of Drosophila
pseudoobscura," Genetics, Vol. 33 (1948), pp. 537-547.



A MULTIPLE DECISION PROCEDURE FOR CERTAIN PROBLEMS IN 
THE ANALYSIS OF VARIANCE 


By Edward Paulson


University of Washington 


1. Introduction. In this paper we will discuss a certain type of problem 
which arises in many applications of the analysis of variance. We suppose 
that we are given K varieties, and are required to investigate the differences 
among them on the basis of the observed yields from a given experimental 
design, such as a set of randomized blocks or a latin square. The classical 
procedure [1] for dealing with this problem has been to test the null hypothesis 
that the K varieties are all equal by computing the ratio of the mean sum of
squares between varieties to the residual mean sum of squares, and rejecting 
the null hypothesis whenever this ratio exceeded the critical value corresponding 
to the level of significance used. However, the standard discussions of this 
procedure seem to be quite vague on the question of what action should be taken 
after the null hypothesis has been rejected. 

In a number of problems, the practical situation seems to be such that instead
of testing the null hypothesis that the varieties do not differ, what is really
required is a statistical rule or "decision function" which on the basis of the
observed yields will classify the K varieties into a "superior" group and an
"inferior" group. If the superior group consists of more than one variety,
the next appropriate action will of course depend on the particular problem at
hand. In some situations the varieties in the superior group might then be
subject to further selection on the basis of some secondary characteristic, or
additional observations might be taken to discriminate between the members
of the superior group, after discarding the varieties in the inferior group.
However, if all varieties happen to be classified in one group, the group will be
labelled "neutral" and this result is to be interpreted as implying that the
varieties are homogeneous.

In this formulation, the problem is now of a multiple decision type; it is 
necessary to decide on the basis of a sample which one out of the $2^K - 1$ possible
decisions (or classifications) to select. We will suggest a solution which seems 
quite reasonable on an intuitive basis, but it is still an open question whether 
this solution is an optimum one. 


2. A special case. In this section we will discuss the problem under the
assumption that the variance $\sigma^2$ of a single observation is known a priori. This
is a rather restrictive assumption, but it can be considered as approximately
satisfied when the number of degrees of freedom available for estimating the
variance is large, which will often be the case. The minor modifications
necessary to secure exact results for the small sample case when $\sigma$ is unknown are
discussed in section 3. We also assume that the experimental design has been so
selected that there will be the same number ($r$) of observations on each of the $K$
varieties.

Now let $x_{i\alpha}$ = the $\alpha$th observation on the $i$th variety ($i = 1, 2, \dots, K$;
$\alpha = 1, 2, \dots, r$), let $\bar{x}_i = \sum_{\alpha=1}^{r} x_{i\alpha}/r$, put $m_i = E(\bar{x}_i)$ where $E$ stands for expected
value, and take $\lambda$ to be a given positive constant. The conventional assumption
is made that all the observations are normally and independently distributed
with the same variance $\sigma^2$. Denote by $\bar{x}_M$ the maximum of the $K$ mean values
$\bar{x}_1, \bar{x}_2, \dots, \bar{x}_K$. The rule for dividing the varieties into superior and inferior
groups is the following: the superior group is to consist of all varieties whose
corresponding mean values fall in the interval $[\bar{x}_M - \lambda\sigma/\sqrt{r},\ \bar{x}_M]$ and the remaining
varieties constitute the inferior group. (As mentioned earlier, if all the varieties
fall into one group, this group is labelled "neutral" and the varieties are considered
homogeneous.)
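
The rule can be written out as a short function. This is a sketch only; the name `classify`, the zero-based variety indices and the dictionary return format are choices made here for illustration.

```python
import math


def classify(means, sigma, r, lam):
    """Apply the interval rule to the K variety means.

    Varieties whose means lie within lam * sigma / sqrt(r) of the largest
    mean form the superior group; the rest form the inferior group. If no
    variety is inferior, the single group is labelled "neutral".
    """
    cutoff = max(means) - lam * sigma / math.sqrt(r)
    superior = [i for i, m in enumerate(means) if m >= cutoff]
    inferior = [i for i, m in enumerate(means) if m < cutoff]
    if not inferior:
        return {"neutral": superior}
    return {"superior": superior, "inferior": inferior}
```

With means 10.0, 9.5 and 7.0, $\sigma = 1$, $r = 4$ and $\lambda = 2$, the cut-off is 9.0, so the first two varieties are superior and the third is inferior.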

This rule completely determines the classification as soon as $\lambda$ is determined.
For a given sample size, we might select $\lambda$ by considering the relative importance
of different types of incorrect classifications. If H denotes the error of
misclassifying the varieties when in fact they are all equal, and G denotes the error of
misclassifying the varieties when they actually are unequal, then it is obvious
that the greater the value of $\lambda$, the smaller the probability of an error of type H,
but the greater the probability of an error of type G. Therefore for a given
value of $r$ it is necessary to adopt some sort of compromise in selecting $\lambda$.

For a given value of $\lambda$ we will now derive explicit formulas for $P(H)$, the
probability of not classifying all the varieties in one group when $m_1 = m_2 = \cdots = m_K$,
and for $P(G_1)$ the probability that as a result of the experiment there will
not be a superior group consisting only of the $K$th variety when $m_1 = m_2 = \cdots = m_{K-1} = m$
and $m_K = m + \Delta$ ($\Delta > 0$). $G_1$ was selected because it appeared to be
the particular kind of type G error most likely to be useful in applications.
Also $P(G_1)$ may be regarded as the least upper bound of the probability of
misclassifying the varieties when one variety is superior to any of the others
by an amount at least equal to $\Delta$. Now if we denote by $W = (\bar{x}_M - \bar{x}_{\min})$
the difference between the maximum and minimum values of the set $\{\bar{x}_i\}$
($i = 1, 2, \dots, K$), then it is obvious that

$$1 - P(H) = P\{W \le \lambda\sigma/\sqrt{r}\}. \tag{2.1}$$

The right hand side of (2.1) is equivalent to the probability that the range of a
sample of $K$ independent observations from a normal distribution with unit
variance be less than $\lambda$; this probability has already been tabulated by Pearson
and Hartley [2]. From these tables it is a routine matter to find $P(H)$
corresponding to a given value of $\lambda$, and conversely. To evaluate $P(G_1)$, we have

$$1 - P(G_1) = P\{\bar{x}_i < \bar{x}_K - \lambda\sigma/\sqrt{r} \ \text{for each } i\ (= 1, 2, \dots, K - 1)\}.$$




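By evaluating the probability of this event for a fixed value of $\bar{x}_K$ and then
integrating out with respect to $\bar{x}_K$, it is a simple matter to verify that

$$P(G_1) = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y + (\Delta/\sigma)\sqrt{r} - \lambda} e^{-t^2/2}\, dt \right]^{K-1} dy. \tag{2.2}$$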
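
In the absence of tables, both error probabilities can be approximated by numerical integration. The sketch below is an assumption-laden illustration, not part of the original procedure: the function names are hypothetical, the integrals are truncated to $[-10, 10]$ and evaluated by Simpson's rule, $P(H)$ uses the standard expression $K \int \varphi(x)[\Phi(x) - \Phi(x - \lambda)]^{K-1} dx$ for the probability that the range of $K$ unit normal observations is less than $\lambda$, and `shift` stands for $(\Delta/\sigma)\sqrt{r}$.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def integrate(f, lo=-10.0, hi=10.0, steps=4000):
    """Composite Simpson rule on [lo, hi]; `steps` must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def prob_H(K, lam):
    """P(H) = 1 - P{range of K unit normals < lam}, as in (2.1)."""
    inside = integrate(lambda x: K * phi(x) * (Phi(x) - Phi(x - lam)) ** (K - 1))
    return 1.0 - inside


def prob_G1(K, lam, shift):
    """P(G_1) from (2.2), with shift = (Delta / sigma) * sqrt(r)."""
    return 1.0 - integrate(lambda y: phi(y) * Phi(y + shift - lam) ** (K - 1))
```

Evaluating both functions over a grid of $\lambda$ for fixed $K$ and `shift` shows the compromise described earlier: increasing $\lambda$ lowers `prob_H` and raises `prob_G1`.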


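In some applications, it may be desirable to have an explicit expression for the
probability that the superior group will consist of the $K$th variety and not more
than $s$ inferior varieties when $m_1 = m_2 = \cdots = m_{K-1} = m$ and $m_K = m + \Delta$.
If we denote this probability by $1 - P_s$ it is not difficult to show that

$$1 - P_s = \sum_{a=0}^{s} \binom{K-1}{a} \left[ T_{1a} + a\,T_{2a} \right], \quad \text{where} \tag{2.3}$$

$$T_{1a} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y + (\Delta/\sigma)\sqrt{r} - \lambda} e^{-t^2/2}\, dt \right]^{K-a-1} \left[ \frac{1}{\sqrt{2\pi}} \int_{y + (\Delta/\sigma)\sqrt{r} - \lambda}^{y + (\Delta/\sigma)\sqrt{r}} e^{-t^2/2}\, dt \right]^{a} dy,$$

$$T_{2a} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y - \lambda} e^{-t^2/2}\, dt \right]^{K-a-1} \left[ \frac{1}{\sqrt{2\pi}} \int_{y - \lambda}^{y} e^{-t^2/2}\, dt \right]^{a-1} \left[ \frac{1}{\sqrt{2\pi}} \int_{y - (\Delta/\sigma)\sqrt{r} - \lambda}^{y - (\Delta/\sigma)\sqrt{r}} e^{-t^2/2}\, dt \right] dy.$$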


3. General case. We now briefly discuss the exact treatment of the problem
when $\sigma$ is unknown. The notation of section 2 will be used, but in addition
denote by $s^2$ an estimate of $\sigma^2$ resulting from the given experimental design
which is based on the residual sum of squares with $n$ degrees of freedom. It is
well known that $s^2$ is independent of the set $\{\bar{x}_i\}$ ($i = 1, 2, \dots, K$). Now the
rule to be used in classifying the varieties into two groups is as follows: the
superior group is to consist of all those varieties whose mean values fall in the
interval $[\bar{x}_M - \lambda s/\sqrt{r},\ \bar{x}_M]$, and the inferior group consists of the remaining
varieties.

We now find that:

$$1 - P(H) = P\{W \le \lambda s/\sqrt{r}\}. \tag{3.1}$$

The right hand side of (3.1) depends only on the distribution of the "studentized"
range and has also been tabulated by Pearson and Hartley [3] although the
tabulation is considerably less complete than that of the range in [2]. It is also
easy to verify that the expression for $P(G_1)$ now becomes

$$P(G_1) = 1 - \frac{n^{n/2}}{2^{(n-2)/2}\, \sqrt{2\pi}\; \Gamma(n/2)} \int_{0}^{\infty} \int_{-\infty}^{\infty} w^{n-1} e^{-(n w^2 + y^2)/2} \left[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y + (\Delta/\sigma)\sqrt{r} - \lambda w} e^{-t^2/2}\, dt \right]^{K-1} dy\, dw, \tag{3.2}$$

with a similar modification for $P_s$.










4. Remarks. Any application of the ideas suggested here would be greatly
facilitated if tables of $P(G_1)$ were made available. If this were done, it would be
possible to decide in advance of an experiment how large $r$ should be in order
to have a fixed control over both types H and $G_1$ errors. It is obvious that
further research both along theoretical and applied lines is needed. In
conclusion, the writer would like to thank Professor Albert Bowker for several helpful
suggestions.


REFERENCES 


[1] R. A. Fisher, Statistical Methods for Research Workers, Chapters 7, 8.

[2] E. S. Pearson and H. O. Hartley, "Tables of the probability integral of the range in
samples from a normal population," Biometrika, Vol. 32 (1941-42), pp. 301-310.

[3] E. S. Pearson and H. O. Hartley, "Tables of the probability integral of the
studentized range," Biometrika, Vol. 33 (1943), pp. 89-99.




A MODIFIED EXTREME VALUE PROBLEM


By Benjamin Epstein¹


Coal Research Laboratory, Carnegie Institute of Technology 


1. Introduction and summary. Consider the following problem. 

Particles are distributed over unit areas in such a way that the number of
particles to be found in such areas is a random variable following the law of
Poisson, with $\nu$ equal to the expected number of particles per unit area.
Furthermore, the particles themselves are assumed to vary in magnitude according
to a size distribution specified (independently of the particular unit area chosen)
by a d.f. $F(x)$ defined over some interval $a \le x \le b$, with $F(a) = 0$ and
$F(b) = 1$. The problem is to find the distribution of the smallest, largest, or
more generally the $n$th smallest or $n$th largest particle in randomly chosen
unit areas.

The problem as stated is not completely specified. To specify the distribution 
of smallest or largest particles in a unit area one must give a rule for dealing with 
those areas which contain no particles at all. More generally, in the case of the 
distribution of the nth smallest or nth largest particle, one must give a rule for 
dealing with those areas which contain (n — 1) or fewer particles. There are at 
least two possible alternatives. One alternative is to omit none of the areas 
from consideration by setting up the following rule: if no particles are found in a 
given unit area then this area will be considered as one for which the smallest size 
particle is x = b and for which the largest size particle is x = a. More generally, 
if (n — 1) or fewer particles are found in a given unit area then this area will be 
considered as one for which the nth smallest size particle is x = b and for which 
the nth largest size particle is x = a. A second alternative is to restrict attention
to those areas which contain at least one particle (in the case of the distribution 
of smallest or largest values) or at least n particles (in the case of the distribution 
of the nth smallest or nth largest particle). In other words, this means finding 
the relevant conditional distribution. 

From the point of view of the application of the theory of extreme values to 
fracture problems, there are some situations where the first model and other 
situations where the second model is the more appropriate in describing the 
phenomenon under investigation. In this paper section 2 will be devoted to a 
derivation of the distributions associated with the first alternative; in section 3 
the conditional distributions will be described briefly. 


¹ Present address, Department of Mathematics, Wayne University, Detroit, Michigan.


2. The distributions under the first alternative. In this section we shall
be concerned with the first alternative. To find the distribution of the $n$th
smallest particle in unit areas, we first observe (the verification is left to the
reader) that under the hypotheses of section 1, the number of particles having
size $\le x$ in a unit area is distributed according to the law of Poisson, with
expected number equal to $\nu F(x)$. Next we note that the probability that the
$n$th smallest particle in a unit area exceeds $x$ in size is equal to the probability of
finding exactly 0, or exactly 1, or exactly 2, $\dots$, or exactly $(n - 1)$ particles of
size $\le x$ in that area. Therefore $G_n(x)$, the probability that the $n$th smallest size
particle in a unit area is $\le x$, is given by

$$G_n(x) = \begin{cases} 1 - \displaystyle\sum_{j=0}^{n-1} e^{-\nu F(x)} \frac{(\nu F(x))^j}{j!}, & x < b; \\[2ex] 1, & x \ge b, \end{cases} \tag{1}$$

where we have assigned to the size $x = b$ the probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ which is
just equal to the probability of finding fewer than $n$ particles in a unit area.

If the d.f. $F(x)$ has a derivative $f(x)$ for all $x$ lying in $a < x < b$, then $G_n(x)$
has a derivative for any value of $x \ne b$. Therefore the probability density for
the $n$th smallest size particle is, for any $x \ne b$, given by the function $g_n(x)$ where

$$g_n(x) = \begin{cases} e^{-\nu F(x)} \dfrac{[\nu F(x)]^{n-1}}{(n-1)!}\, \nu f(x), & a < x < b; \\[2ex] 0, & x < a,\ x > b. \end{cases} \tag{2}$$

A finite probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ is assigned to $x = b$.

If one makes the transformation $y = \nu F(x)$ (for a similar transformation in
extreme value theory see [1, page 371]), then (1) and (2) become

$$G_n(y) = \begin{cases} 1 - \displaystyle\sum_{j=0}^{n-1} e^{-y} \frac{y^j}{j!}, & y < \nu; \\[2ex] 1, & y \ge \nu, \end{cases} \tag{1'}$$

and

$$g_n(y) = \begin{cases} \dfrac{e^{-y} y^{n-1}}{(n-1)!}, & 0 < y < \nu; \\[2ex] 0, & y < 0,\ y > \nu. \end{cases} \tag{2'}$$

A finite probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ is assigned to $y = \nu$.

The distribution of the smallest size particle in a randomly chosen area is
found by letting $n = 1$ in equation (1).
In a similar way one can find the distribution of the $n$th largest particle in a
randomly chosen unit area. $H_n(x)$, the probability that the $n$th largest size
particle in a unit area is $\le x$, is given by

$$H_n(x) = \begin{cases} 0, & x < a; \\[1ex] \displaystyle\sum_{j=0}^{n-1} e^{-\nu[1 - F(x)]} \frac{[\nu(1 - F(x))]^j}{j!}, & x \ge a, \end{cases} \tag{3}$$

where we have assigned to the size $x = a$ the probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$.

If, as before, $F(x)$ is assumed to have a derivative $f(x)$ for all $x$ lying in
$a < x < b$, then the probability density for the $n$th largest size particle is, for
any $x \ne a$, given by the function $h_n(x)$ where

$$h_n(x) = \begin{cases} e^{-\nu[1 - F(x)]} \dfrac{[\nu(1 - F(x))]^{n-1}}{(n-1)!}\, \nu f(x), & a < x < b; \\[2ex] 0, & x < a,\ x > b. \end{cases} \tag{4}$$

A finite probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ is assigned to $x = a$.

If one makes the transformation $z = \nu[1 - F(x)]$, then (3) and (4) become

$$H_n(z) = \begin{cases} 1 - \displaystyle\sum_{j=0}^{n-1} e^{-z} \frac{z^j}{j!}, & z < \nu; \\[2ex] 1, & z \ge \nu, \end{cases} \tag{3'}$$

and

$$h_n(z) = \begin{cases} \dfrac{e^{-z} z^{n-1}}{(n-1)!}, & 0 < z < \nu; \\[2ex] 0, & z < 0,\ z > \nu, \end{cases} \tag{4'}$$

with a finite probability $\sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!$ assigned to $z = \nu$.

The distribution of the largest size particle in a randomly chosen unit area is
found by letting $n = 1$ in equation (3).
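
Both distribution functions reduce to sums of Poisson terms, which makes them simple to evaluate. The sketch below assumes the forms $G_n(x) = 1 - P\{N \le n - 1\}$ with mean $\nu F(x)$ and $H_n(x) = P\{N \le n - 1\}$ with mean $\nu[1 - F(x)]$ described in this section; the function names and the convention of passing $F$ as a callable are illustrative choices.

```python
import math


def poisson_cdf(k, mean):
    """P(N <= k) for a Poisson variable N with the given mean."""
    return sum(math.exp(-mean) * mean ** j / math.factorial(j) for j in range(k + 1))


def G(n, x, nu, F, b):
    """d.f. of the nth smallest particle size under the first alternative."""
    if x >= b:
        return 1.0  # includes the probability mass placed at x = b
    return 1.0 - poisson_cdf(n - 1, nu * F(x))


def H(n, x, nu, F, a):
    """d.f. of the nth largest particle size under the first alternative."""
    if x < a:
        return 0.0
    return poisson_cdf(n - 1, nu * (1.0 - F(x)))
```

For sizes uniformly distributed on $[0, 1]$ and $\nu = 2$, the function `H` gives $H_1(0) = e^{-2}$, the probability that a unit area contains no particle at all.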


3. Conditional distributions of the extreme values. The appropriate
conditional distributions for the problem under consideration can be written down
readily. The step function component which occurred in section 2 is no longer
present since we restrict our attention only to those areas which contain at least
$n$ particles (in the general case of the distribution of $n$th smallest or $n$th largest
size particles).

$G_n^*(x)$, the d.f. of the $n$th smallest particle in a unit area chosen at random
from the class of areas containing at least $n$ particles, is given by

$$G_n^*(x) = \begin{cases} 0, & x < a; \\[1ex] \dfrac{1 - \sum_{j=0}^{n-1} e^{-\nu F(x)} (\nu F(x))^j/j!}{1 - \sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!}, & a \le x \le b; \\[2ex] 1, & x > b. \end{cases} \tag{5}$$

Similarly $H_n^*(x)$, the d.f. of the $n$th largest particle in a unit area chosen at random
from the class of areas containing at least $n$ particles, is given by

$$H_n^*(x) = \begin{cases} 0, & x < a; \\[1ex] \dfrac{\sum_{j=0}^{n-1} e^{-\nu[1 - F(x)]} (\nu[1 - F(x)])^j/j! - \sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!}{1 - \sum_{j=0}^{n-1} e^{-\nu} \nu^j/j!}, & a \le x \le b; \\[2ex] 1, & x > b. \end{cases} \tag{6}$$
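
The conditional distribution functions differ from the unconditional ones only by removal of the Poisson mass for fewer than $n$ particles and a renormalization. A sketch under that reading of (5) and (6) follows; the function names are hypothetical, and $F$ is assumed to be supplied as a callable that returns 0 below $a$ and 1 above $b$, so the three cases are handled by the same expression.

```python
import math


def poisson_cdf(k, mean):
    """P(N <= k) for a Poisson variable N with the given mean."""
    return sum(math.exp(-mean) * mean ** j / math.factorial(j) for j in range(k + 1))


def G_cond(n, x, nu, F):
    """d.f. of the nth smallest size, given at least n particles."""
    return (1.0 - poisson_cdf(n - 1, nu * F(x))) / (1.0 - poisson_cdf(n - 1, nu))


def H_cond(n, x, nu, F):
    """d.f. of the nth largest size, given at least n particles."""
    too_few = poisson_cdf(n - 1, nu)  # probability of fewer than n particles
    return (poisson_cdf(n - 1, nu * (1.0 - F(x))) - too_few) / (1.0 - too_few)
```

Unlike the functions of section 2, both of these rise continuously from 0 at $x = a$ to 1 at $x = b$ when $F$ is continuous.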


4. General remarks and an application. It is interesting to note that the 
assumptions of section 1 lead to distribution functions in section 2 which are 
precisely the same as the asymptotic distributions of smallest, largest, or nth 
smallest, or nth largest values in samples of fixed size $N$ ($N \to \infty$) (see e.g.
[1, p. 371]). In the problem treated in this paper, $\nu$, the expected number of
particles in a unit area, plays the role of N in the fixed sample size case, with the 
important difference that the distributions in the present paper are exact and 
not merely asymptotic. 

The results of this paper have a direct bearing on certain aspects of fracture 
problems [2] and in particular on the dielectric breakdown of capacitors [3]. 
In the latter problem there appears to be ample justification for assuming that 
the breakdown voltage is influenced to a considerable degree by the presence of 
flaws known in the technical literature as conducting particles. These particles 
are spread individually and collectively at random throughout the area of the 
capacitor and, depending on their size, create a local weakening of the capacitor 
by reducing the nominal insulation thickness in the neighborhood of flaws. 
The voltage required to break down the capacitor is equal to that required to 
break it down at that spot where the greatest penetration has taken place. 

In the dielectric problem the statistical distribution of largest values
appropriate to the problem is given by (3) with $n = 1$, and the size distribution of
conducting particles follows a law of the form $f(x) = \lambda e^{-\lambda x}$, $x > 0$. This is a
situation where all the capacitors under test are part of the sample (since all
must be tested to destruction) and those which happen to contain no defects (an
event with probability $e^{-\nu}$) act as if the largest particle size is equal to $a = 0$.
$e^{-\nu}$ simply represents the expected fraction of capacitors which have strength
equal to the theoretical strength of the insulation.
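
As a worked instance of this case (a sketch, assuming the exponential law holds exactly, so that $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$, with $a = 0$ and $b = \infty$), substituting into (3) with $n = 1$ gives

```latex
H_1(x) =
\begin{cases}
0, & x < 0, \\
\exp\!\left(-\nu e^{-\lambda x}\right), & x \ge 0,
\end{cases}
\qquad
H_1(0) = e^{-\nu}.
```

The jump of size $e^{-\nu}$ at $x = 0$ is the fraction of capacitors containing no conducting particle, and for $x > 0$ the distribution has the doubly exponential form familiar from the asymptotic theory of largest values.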

The conditional distributions of section 3 would be more appropriate in the 
following sort of practical situation. Suppose that surface flaws spread at 
random on glass rods are known to reduce greatly the strength of the rods. 
Suppose that in a given sample of glass rods one takes out by some method of 
inspection those specimens which have no flaws. Then the strength distribution 
of the remaining specimens is a conditional distribution since each specimen must 
contain at least one flaw to be eligible as a member of the sample. 




REFERENCES 
[1] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 1946.
[2] B. Epstein, "Statistical aspects of fracture problems," J. Applied Phys., Vol. 19
(1948), pp. 140-147.
[3] B. Epstein and H. Brooks, "The theory of extreme values and its implications in the
study of the dielectric strength of paper capacitors," J. Applied Physics, Vol. 19
(1948), pp. 544-550.








ON DISTINCT HYPOTHESES 


By AGNES BERGER AND ABRAHAM WALD 
Columbia University 


1. Introduction. The following problem was suggested to one of the authors 
by Professor Neyman: 

Let $X = (X_1, X_2, \dots, X_n)$ be a chance vector and let $h$ denote any simple
hypothesis specifying its distribution. Let $H_i$ be the composite hypothesis
that some element $h$ of a set of simple hypotheses $\{h\}_i$ ($i = 0, 1$) is true, and
assume that $H_0$ and $H_1$ are known to be exhaustive. Let $h_i$ denote an element of
$\{h\}_i$ ($i = 0, 1$).

For any region W of the sample space S, let P(W | h) be the probability that 
the sample point falls in W when h is true. 

We shall call $H_0$ and $H_1$ distinct, if a region $W$ exists for which

$$P(W \mid h_0) \ne P(W \mid h_1)$$

for all $h_0 \in \{h\}_0$ and all $h_1 \in \{h\}_1$.

The problem is to establish necessary and sufficient conditions for two composite
hypotheses $H_0$ and $H_1$ to be distinct.

For any critical region $W$ for testing $H_0$ against $H_1$, let $\gamma(W \mid h)$ be the
probability of a wrong decision when $h$ is true, i.e.

$$\gamma(W \mid h) = \begin{cases} P(W \mid h) & \text{for } h \in H_0, \\ 1 - P(W \mid h) & \text{for } h \in H_1. \end{cases}$$

Suppose now that $H_0$ and $H_1$ are not distinct. Then to any $W$ a pair $h_0$, $h_1$
exist such that

$$P(W \mid h_0) = P(W \mid h_1),$$

thus

$$\gamma(W \mid h_0) = 1 - \gamma(W \mid h_1),$$

and therefore

$$\operatorname*{l.u.b.}_{h}\ \gamma(W \mid h) \ge \tfrac{1}{2} \quad \text{for any } W. \tag{1.1}$$


This property of non-distinct hypotheses leads us to investigate the conditions
under which 2 hypotheses allow a test where the maximum probability of a
wrong decision is $< \tfrac{1}{2}$.

The result, in turn, will enable us to state, for an important class of hypotheses,
a necessary and sufficient condition for 2 composite hypotheses to be distinct.


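2. A lemma. We shall now prove the following lemma:

Lemma 2.1. Assume that $X$ has a density function $p(x)$ and let $H_i = h_i$ be the
simple hypothesis that $p(x) = p_i(x)$ ($i = 0, 1$). Assume that the set $R$ of $x$'s
satisfying $p_0(x) \ne p_1(x)$ has a positive measure. Then there exists a region $W$
such that $\gamma(W \mid p_i) < \tfrac{1}{2}$, $i = 0, 1$.

Proof: Let $R_0$ be defined by $p_0 = p_1$, $R_1$ by $p_0 < p_1$, $R_2$ by $p_0 > p_1$. Since
$\int_S p_i(x)\, dx = 1$ and $p_i(x) \ge 0$ ($i = 0, 1$), $R_1$ and $R_2$ are of positive measure.
Let

$$\varphi(x) = \begin{cases} p_1 & \text{in } R_1, \\ p_0 & \text{in } R_2, \\ p_1 = p_0 & \text{in } R_0. \end{cases}$$

Then $\int_S \varphi(x)\, dx > 1$ and either

$$\text{a)} \ \int_{R_1 + R_0} p_1\, dx > \tfrac{1}{2} \qquad \text{or} \qquad \text{b)} \ \int_{R_2} p_0\, dx > \tfrac{1}{2}$$

or both. Assume first a).

Let $R_3 \subset R_1 + R_0$ and such that $\int_{R_3} p_1\, dx = \tfrac{1}{2}$, but $\int_{R_3} p_0\, dx < \tfrac{1}{2}$. This
can be done by including into $R_3$ a part of $R_1$ of non-zero measure. Let
$R_4 \subset (R_1 + R_0) - R_3$ and such that $0 < \int_{R_4} p_1\, dx < \tfrac{1}{2} - \int_{R_3} p_0\, dx$. Then

$$\int_{R_4} p_0\, dx \le \int_{R_4} p_1\, dx < \tfrac{1}{2} - \int_{R_3} p_0\, dx, \quad \text{thus} \quad \int_{R_3 + R_4} p_0\, dx < \tfrac{1}{2} \ \text{but} \ \int_{R_3 + R_4} p_1\, dx > \tfrac{1}{2}.$$

Assume now b).

Let $R_5 \subset R_2$ and such that $\int_{R_5} p_0\, dx = \tfrac{1}{2}$. Then $\int_{R_5} p_1\, dx < \tfrac{1}{2}$.
Let $R_6 \subset R_2 - R_5$ and such that $0 < \int_{R_6} p_0\, dx < \tfrac{1}{2} - \int_{R_5} p_1\, dx$. Then

$$\int_{R_5 + R_6} p_0\, dx > \tfrac{1}{2} \quad \text{and} \quad \int_{R_5 + R_6} p_1\, dx < \tfrac{1}{2}.$$

Thus in case a) $W = R_3 + R_4$, and in case b) $W = S - R_5 - R_6$ is a critical
region for which $\gamma(W \mid p_i) < \tfrac{1}{2}$ ($i = 0, 1$). This proves the lemma.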


3. The main theorem. Assume now $X$ to have a density function $p(x \mid \theta)$
where $\theta = (\theta_1, \theta_2, \cdots, \theta_k)$ is an unknown parameter point. Let $\omega_0$ and $\omega_1$
be two disjoint, bounded and closed subsets of the $k$-dimensional $\theta$-space.
Let $\Omega = \omega_0 + \omega_1$ and suppose that $\theta$ is known to belong to $\Omega$, which therefore
will be called the parameter space. Let $H_i$ be the hypothesis that the true
parameter point is an element of $\omega_i$ $(i = 0, 1)$.

We shall consider the problem of testing $H_0$ against $H_1$. Clearly, $P(W \mid h)$
can now be written as $P(W \mid \theta)$ and $\gamma(W \mid h)$ as $\gamma(W \mid \theta)$.

We shall make the following assumptions concerning $p(x \mid \theta)$:








106 AGNES BERGER AND ABRAHAM WALD 


Assumption 1. $p(x \mid \theta)$ is continuous in $\theta$. This is of course always fulfilled
if $\Omega$ consists only of a finite number of points.
Assumption 2. For any bounded domain $M$ of the sample space we have

$$\int_M \left[\max_\theta\, p(x \mid \theta)\right] dx < \infty.$$

It follows from Assumptions 1 and 2 that

$$\lim_{r \to \infty} \int_{S - S_r} p(x \mid \theta)\,dx = 0 \tag{3.1}$$

uniformly in $\theta$, where $S_r$ is the sphere in the sample space with center at the
origin and radius $r$.


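In what follows, whenever we shall speak of a cumulative distribution function
$g(\theta)$ in the $k$-dimensional parameter space, we shall always mean a cumulative
distribution function satisfying the condition

$$\int_\Omega dg(\theta) = 1.$$

For any c.d.f. $g(\theta)$ let $W_g$ denote a critical region which contains any sample
point $x$ satisfying the inequality

$$\int_{\omega_1} p(x \mid \theta)\,dg(\theta) > \int_{\omega_0} p(x \mid \theta)\,dg(\theta),$$

and does not contain a sample point $x$ for which

$$\int_{\omega_1} p(x \mid \theta)\,dg(\theta) < \int_{\omega_0} p(x \mid \theta)\,dg(\theta).$$

It can easily be verified that $W_g$ minimizes the average risk

$$\int_\Omega \gamma(W \mid \theta)\,dg(\theta), \quad \text{i.e.,} \quad \int_\Omega \gamma(W_g \mid \theta)\,dg(\theta) = \min_W \int_\Omega \gamma(W \mid \theta)\,dg(\theta). \tag{3.2}$$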
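The minimizing property (3.2) can be checked by brute force in a small discrete analogue. The sketch below is only an illustration under stated assumptions: the three-point sample space, the parameter labels "a", "b", "c", the probabilities and the weights are all hypothetical, and point probabilities stand in for the densities $p(x \mid \theta)$.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical toy model (not from the paper): sample space {0, 1, 2}.
# p[t][x] plays the role of p(x | theta) for a parameter point t.
p = {
    "a": [F(1, 2), F(1, 3), F(1, 6)],    # the single point of omega_0
    "b": [F(1, 4), F(1, 4), F(1, 2)],    # a point of omega_1
    "c": [F(1, 10), F(3, 10), F(3, 5)],  # a point of omega_1
}
omega0, omega1 = ["a"], ["b", "c"]
g = {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)}  # weights dg(theta), summing to 1

def average_risk(W):
    """Integral of gamma(W | theta) dg(theta) for a critical region W."""
    risk = F(0)
    for t in omega0:                      # wrong decision: x falls in W
        risk += g[t] * sum(p[t][x] for x in W)
    for t in omega1:                      # wrong decision: x falls outside W
        risk += g[t] * (1 - sum(p[t][x] for x in W))
    return risk

def bayes_region():
    """W_g: points where the omega_1 mixture exceeds the omega_0 mixture."""
    region = []
    for x in range(3):
        w1 = sum(g[t] * p[t][x] for t in omega1)
        w0 = sum(g[t] * p[t][x] for t in omega0)
        if w1 > w0:
            region.append(x)
    return tuple(region)

all_regions = [c for r in range(4) for c in combinations(range(3), r)]
W_g = bayes_region()
best = min(average_risk(W) for W in all_regions)
print(W_g, average_risk(W_g), best)
```

Enumerating all eight regions confirms that no region has smaller average risk than $W_g$ in this toy case.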


Let $\Omega_i$ $(i = 0, 1)$ be the class of all density functions $p(x) = \int_{\omega_i} p(x \mid \theta)\,dg_i(\theta)$
where $g_i(\theta)$ is subject to the condition

$$\int_{\omega_i} dg_i(\theta) = 1.$$

Two density functions $p(x)$ and $q(x)$ are said to be equal if $p(x) \neq q(x)$ holds
only in a set of measure zero.

It follows from (3.1) and Assumptions 1 and 2 that $\gamma(W \mid \theta)$ is a continuous
function of $\theta$. Let $\gamma(W)$ denote the maximum of $\gamma(W \mid \theta)$ with respect to $\theta$.
We shall prove the following theorem:

THEOREM 3.1. A necessary and sufficient condition for the existence of a region
$W$ such that $\gamma(W) < \tfrac{1}{2}$ is that the classes $\Omega_0$ and $\Omega_1$ be disjoint.


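PROOF. Suppose that $\Omega_0$ and $\Omega_1$ are not disjoint. Then there exist two
distribution functions $g_0(\theta)$ and $g_1(\theta)$ such that

$$\int_{\omega_0} dg_0(\theta) = \int_{\omega_1} dg_1(\theta) = 1$$

and

$$\int_{\omega_0} p(x \mid \theta)\,dg_0(\theta) = \int_{\omega_1} p(x \mid \theta)\,dg_1(\theta)$$

(except perhaps for points $x$ in a set of measure 0).

Let $g(\theta) = \tfrac{1}{2} g_0(\theta) + \tfrac{1}{2} g_1(\theta)$. Clearly, $\gamma(W) \geq \int_\Omega \gamma(W \mid \theta)\,dg(\theta) = \tfrac{1}{2}$ for
any $W$. This proves the necessity of our condition.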
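The necessity argument can be illustrated on a finite sample space, where every region $W$ can be enumerated. This is a sketch under assumptions that are not in the paper: the three-point sample space and the probability vectors `p_a`, `p_b`, `p_far` are hypothetical, and point probabilities replace densities. When the single alternative is an equal mixture of the two null distributions, no region brings the maximum error below $\tfrac{1}{2}$; when it lies outside every such mixture, a suitable region exists in this example.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical distributions on the sample space {0, 1, 2}.
p_a = [F(1, 2), F(1, 3), F(1, 6)]
p_b = [F(1, 4), F(1, 3), F(5, 12)]

def max_error(W, omega0, omega1):
    """gamma(W): largest probability of a wrong decision over all hypotheses."""
    errors = [sum(p[x] for x in W) for p in omega0]        # reject a true null
    errors += [1 - sum(p[x] for x in W) for p in omega1]   # accept a false null
    return max(errors)

def best_max_error(omega0, omega1):
    """Minimum of gamma(W) over every region W of the three-point space."""
    regions = [c for r in range(4) for c in combinations(range(3), r)]
    return min(max_error(W, omega0, omega1) for W in regions)

# Case 1: the alternative is a mixture of the nulls, so the classes overlap.
p_mix = [(u + v) / 2 for u, v in zip(p_a, p_b)]
not_disjoint = best_max_error([p_a, p_b], [p_mix])

# Case 2: the alternative is not any mixture of p_a and p_b.
p_far = [F(1, 10), F(1, 10), F(4, 5)]
disjoint = best_max_error([p_a, p_b], [p_far])

print(not_disjoint, disjoint)
```

Because a discrete space has atoms, the sufficiency half of Theorem 3.1 need not carry over to every such example; the second case merely happens to admit a region.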

We shall now assume that $\Omega_0$ and $\Omega_1$ are disjoint. First we shall show that the
results of [1] can be applied. On pages 297-8 of [1] there are seven conditions
listed for the sequential case. For the non-sequential case (the one considered
here) the conditions 6 and 7 drop out and the first five conditions can be reduced
to the following conditions:

Condition 1: The weight function $W(\theta, d)$ is bounded.

Condition 2: For any $\theta$, the chance vector $X$ admits a density function $p(x \mid \theta)$.

Condition 3: For any sequence $\{\theta_i\}$ $(i = 1, 2, \cdots,$ ad inf.$)$ there exists a
subsequence $\{\theta_{i_j}\}$ $(j = 1, 2, \cdots)$ and a parameter point $\theta_0$ such that

$$\lim_{j \to \infty} p(x \mid \theta_{i_j}) = p(x \mid \theta_0).$$

Condition 4: If $\{\theta_i\}$ $(i = 1, 2, \cdots)$ is a sequence of points and $\theta$ a point such that

$$\lim_{i \to \infty} p(x \mid \theta_i) = p(x \mid \theta)$$

then

$$\lim_{i \to \infty} W(\theta_i, d) = W(\theta, d)$$

uniformly in $d$.

Condition 5: The same as our Assumption 2.

In our problem $d$ (the decision of the statistician) can take only two values:
acceptance or rejection of $H_0$. Condition 1 is evidently fulfilled, since $W(\theta, d) = 0$
if a correct decision is made, and $= 1$ if a wrong decision is made. Clearly,
Conditions 2-5 are also fulfilled in our problem.

A distribution $g(\theta)$ is said to be least favorable, if it maximizes the minimum
average risk, i.e., if it maximizes $\int_\Omega \gamma(W_g \mid \theta)\,dg(\theta)$ with respect to $g$.

It follows from Theorems 4.1 and 4.4 of [1] that there exists a least favorable
distribution.


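Let $g^*(\theta)$ be a least favorable distribution. Then, as has been shown in [1],
there exists a $W_{g^*}$ such that

$$\max_\theta\, \gamma(W_{g^*} \mid \theta) = \int_\Omega \gamma(W_{g^*} \mid \theta)\,dg^*(\theta). \tag{3.3}$$

Thus, our theorem is proved if we can show that

$$\int_\Omega \gamma(W_{g^*} \mid \theta)\,dg^*(\theta) < \tfrac{1}{2}. \tag{3.4}$$

Let $H_0$ be the hypothesis that the true density is given by

$$p_0(x) = \frac{\int_{\omega_0} p(x \mid \theta)\,dg^*(\theta)}{\int_{\omega_0} dg^*(\theta)},$$

and $H_1$ the hypothesis that the true density is given by

$$p_1(x) = \frac{\int_{\omega_1} p(x \mid \theta)\,dg^*(\theta)}{\int_{\omega_1} dg^*(\theta)}.$$

Since $\Omega_0$ and $\Omega_1$ are disjoint, $p_0(x)$ and $p_1(x)$ are different density functions.
Hence, according to Lemma 2.1, there exists a critical region $W^*$ for testing $H_0$
such that $\alpha^* < \tfrac{1}{2}$ and $\beta^* < \tfrac{1}{2}$, where $\alpha^*$ is the probability of type I error, and
$\beta^*$ is the probability of type II error. Clearly,

$$\tfrac{1}{2} > \alpha^* \int_{\omega_0} dg^*(\theta) + \beta^* \int_{\omega_1} dg^*(\theta) = \int_\Omega \gamma(W^* \mid \theta)\,dg^*(\theta). \tag{3.5}$$

Since $W_{g^*}$ minimizes the average risk with respect to $g^*$, (3.4) follows from (3.5).
Hence, our theorem is proved.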

It follows from (1.1) that if $H_0$ and $H_1$ are not distinct, $\Omega_0$ and $\Omega_1$ are not
disjoint.

On the other hand, suppose that $\Omega_0$ and $\Omega_1$ are not disjoint and let

$$\int_{\omega_0} p(x \mid \theta)\,dg_0(\theta) = \int_{\omega_1} p(x \mid \theta)\,dg_1(\theta).$$

Then for every $W$

$$\int_{\omega_0} P(W \mid \theta)\,dg_0(\theta) = \int_{\omega_1} P(W \mid \theta)\,dg_1(\theta). \tag{3.6}$$

Assume now that $\omega_i$ is a connected set $(i = 0, 1)$. Then, because of the
continuity of $P(W \mid \theta)$ there exist two functions $\theta_0(W)$, $\theta_1(W)$, with $\theta_i(W)$ belonging to
$\omega_i$ $(i = 0, 1)$, such that

$$P(W \mid \theta_0(W)) = \int_{\omega_0} P(W \mid \theta)\,dg_0(\theta)$$

and

$$P(W \mid \theta_1(W)) = \int_{\omega_1} P(W \mid \theta)\,dg_1(\theta)$$

for every $W$. Hence, because of (3.6),

$$P(W \mid \theta_0(W)) = P(W \mid \theta_1(W))$$

for every $W$. Thus, we arrive at the following theorem:

THEOREM 3.2. If $\omega_i$ is a connected set $(i = 0, 1)$, then, under the assumptions
of Theorem 3.1, a necessary and sufficient condition for $H_0$ and $H_1$ to be distinct
is that the sets $\Omega_0$ and $\Omega_1$ be disjoint.


REFERENCE 


[1] A. WALD, "Foundations of a general theory of sequential decision functions," Econometrica,
Vol. 15 (1947), pp. 279-313.








AN APPROXIMATION TO THE SAMPLING VARIANCE OF AN ESTI- 
MATED MAXIMUM VALUE OF GIVEN FREQUENCY BASED ON FIT 
OF DOUBLY EXPONENTIAL DISTRIBUTION OF MAXIMUM VALUES¹


By BRADFORD F. KIMBALL


N.Y. State Department of Public Service 


1. Introduction. Given the doubly exponential distribution of maximum
values

$$F(x) = \exp(-e^{-y}), \quad y = \alpha(x - u), \tag{1}$$

where $\alpha$ and $u$ are unknown parameters, with a prescribed frequency $F_0$ the
"reduced variate" $y$ is fixed, say at $y = y_0$. Thus with

$$F_0 = .99, \quad y_0 = 4.60015 \cdots$$

Given a sample of $n$ maximum values $x_i$, we are interested in the sampling
variance of

$$\hat{g} = g(\hat{u}, \hat{\alpha}) = \hat{u} + y_0/\hat{\alpha} \tag{2}$$

due to sampling variations of the estimates $\hat{u}$ and $\hat{\alpha}$.

H. Fairfield Smith has recently pointed out to me that the examples of applications
of sufficient statistical estimation functions to this problem given in a
previous paper (see [1, pp. 307-309]) give too large a range for $\hat{g} = g(\hat{u}, \hat{\alpha})$
because the sample points $(\hat{u}, \hat{\alpha})$ within the confidence region of the constant
probability ellipse apply to optimum estimates of $(u, \alpha)$ rather than to that of
$g = g(u, \alpha)$. What the problem calls for is the determination of the positions of
curves $\underline{g}(u, \alpha)$ and $\bar{g}(u, \alpha)$ such that the integral of the pdf of the estimation
functions over all sample values $(\hat{u}, \hat{\alpha})$ which lie between these two curves is
equal to the confidence level (taken as .95 in the previous paper). Further
considerations of this being the shortest interval $\bar{g} - \underline{g}$ also come into play.

As so often happens in research, the previous analysis, although not giving the
final answer, suggests the next step. If we change our parameters to

$$g = g(u, \alpha) = u + y_0/\alpha, \quad \alpha' = \alpha \tag{3}$$

and are able to carry through the inverse of the maximum likelihood solution
for fitting of (1) to $n$ sample values $x_i$, then we shall be in a position to find the
asymptotic marginal distribution of $\sqrt{n}(\hat{g} - g)$, which will give the answer to our
problem (see [2]).

The Jacobian of this transformation of parameters is

$$\frac{\partial(u, \alpha)}{\partial(g, \alpha')} = \begin{vmatrix} 1 & y_0/\alpha'^2 \\ 0 & 1 \end{vmatrix} = 1$$

and hence for $\alpha' > 0$ no new singularities are introduced.




1 This involves a correction of a previous paper [1]. 


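2. The equations of the maximum likelihood solution. For a sample of
size $n$, the pdf of the sampling distribution in terms of the old parameters is
given by

$$P[u, \alpha, O_n(x_i)] = \alpha^n \exp\left[-\Sigma e^{-\alpha(x_i - u)}\right] \exp\left[-\Sigma \alpha(x_i - u)\right],$$

and

$$\log P = n \log \alpha - \Sigma e^{-\alpha(x_i - u)} - \alpha \Sigma x_i + n\alpha u = n\left[\log \alpha - e^{\alpha u}\left(\Sigma e^{-\alpha x_i}/n\right) - \alpha \bar{x} + \alpha u\right].$$

Now change to the new parameters and use the substitutions:

$$z_i = e^{-\alpha' x_i}, \quad \bar{z} = (\Sigma z_i)/n, \quad z_0 = e^{-\alpha u} = e^{y_0 - \alpha' g}.$$

Thus

$$\partial z_0/\partial g = -\alpha' z_0, \quad \partial z_0/\partial \alpha' = -g z_0,$$

and denoting $\log P$ by $L$ we write

$$L = n\left[\log \alpha' - \bar{z}/z_0 - \alpha' \bar{x} + \alpha' g - y_0\right].$$

Hence

$$L_g = -n\alpha'\left[\bar{z}/z_0 - 1\right]; \tag{4}$$

$$L_{\alpha'} = n\left\{1/\alpha' - \partial(\bar{z}/z_0)/\partial \alpha' - \bar{x} + g\right\}. \tag{5}$$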


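3. Derivation of expected values needed. Recall that

$$\bar{z}/z_0 = e^{-y_0}\, \Sigma e^{-\alpha'(x_i - g)}/n = \Sigma e^{-\alpha(x_i - u)}/n.$$

Hence

$$\partial(\bar{z}/z_0)/\partial \alpha' = -e^{-y_0}\, \Sigma (x_i - g) e^{-\alpha'(x_i - g)}/n, \qquad \partial(\bar{z}/z_0)/\partial \alpha = -\Sigma (x_i - u) e^{-\alpha(x_i - u)}/n; \tag{6}$$

$$\partial^2(\bar{z}/z_0)/\partial \alpha'^2 = e^{-y_0}\, \Sigma (x_i - g)^2 e^{-\alpha'(x_i - g)}/n, \qquad \partial^2(\bar{z}/z_0)/\partial \alpha^2 = \Sigma (x_i - u)^2 e^{-\alpha(x_i - u)}/n. \tag{7}$$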


By investigation of the generating function 
GQ) = E[S@i/a)"'], «=e 


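it can be shown that

$$E\left[\Sigma e^{-\alpha(x_i - u)}/n\right] = 1,$$

$$E\left[\Sigma (x_i - u) e^{-\alpha(x_i - u)}/n\right] = -(1/\alpha)\Gamma'(2) = -(1/\alpha)(1 - C),$$

where $C$ denotes Euler's constant, $.577216 \cdots$, and

$$E\left[\Sigma (x_i - u)^2 e^{-\alpha(x_i - u)}/n\right] = (1/\alpha^2)\Gamma''(2) = (1/\alpha^2)(\pi^2/6 + C^2 - 2C).$$

Hence to find expected values of (6) and (7) we note that

$$-e^{-y_0}\, \Sigma (x_i - g) e^{-\alpha'(x_i - g)}/n = -\Sigma (x_i - g) e^{-\alpha(x_i - u)}/n = -\Sigma (x_i - u) e^{-\alpha(x_i - u)}/n + (y_0/\alpha)\, \Sigma e^{-\alpha(x_i - u)}/n,$$

and therefore

$$E[\partial(\bar{z}/z_0)/\partial \alpha'] = E[\partial(\bar{z}/z_0)/\partial \alpha] + (y_0/\alpha) E(\bar{z}/z_0). \tag{8}$$

Similar analysis shows that

$$E[\partial^2(\bar{z}/z_0)/\partial \alpha'^2] = E[\partial^2(\bar{z}/z_0)/\partial \alpha^2] + (2y_0/\alpha) E[\partial(\bar{z}/z_0)/\partial \alpha] + (y_0^2/\alpha^2) E[\bar{z}/z_0]. \tag{9}$$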

4. The inverse of the maximum likelihood solution. It will first be noted
that the maximum likelihood equations (4) and (5) for determining best estimates
of $g$ and $\alpha'$ become identical to those for determining best estimates of old
parameters $u$ and $\alpha$, when the transformation of parameters (3) is applied to
them. This is easily verified by applying relations developed above.²

This means that the best estimates $\hat{g}$ and $\hat{\alpha}'$ obtained from (4) and (5) are related
to the best estimates of old parameters $\hat{u}$ and $\hat{\alpha}$ by

$$\hat{g} = \hat{u} + y_0/\hat{\alpha}, \quad \hat{\alpha}' = \hat{\alpha}. \tag{10}$$


We now proceed to set up the inverse of the maximum likelihood solution.
In order to do this we first need the variance-covariance matrix of the direct
solution. This is (see [2])

$$\begin{Vmatrix} E[-L_{gg}] & E[-L_{g\alpha'}] \\ E[-L_{\alpha' g}] & E[-L_{\alpha'\alpha'}] \end{Vmatrix}.$$

Now

$$L_{gg} = -n\alpha'^2(\bar{z}/z_0), \qquad E[-L_{gg}] = n\alpha'^2,$$

$$L_{g\alpha'} = -n\left[\bar{z}/z_0 - 1 + \alpha'\, \partial(\bar{z}/z_0)/\partial \alpha'\right], \qquad E[-L_{g\alpha'}] = n(1 - C + y_0);$$

$$L_{\alpha'\alpha'} = -n\left[1/\alpha'^2 + \partial^2(\bar{z}/z_0)/\partial \alpha'^2\right], \qquad E[-L_{\alpha'\alpha'}] = (n/\alpha'^2)\left[\pi^2/6 + (1 - C + y_0)^2\right].$$

Thus the variance-covariance matrix of the estimation functions (4) and (5) is

$$\begin{Vmatrix} n\alpha'^2 & n(1 - C + y_0) \\ n(1 - C + y_0) & (n/\alpha'^2)\left[\pi^2/6 + (1 - C + y_0)^2\right] \end{Vmatrix}.$$

The asymptotic form of the inverse solution for $\sqrt{n}(\hat{g} - g)$ and $\sqrt{n}(\hat{\alpha}' - \alpha')$
will have the variance-covariance matrix which is the reciprocal of the above
matrix, multiplied by $n$. The determinant value of the above matrix reduces to
$n^2(\pi^2/6)$. Thus the reciprocal matrix, adjusted by multiplying by $n$, is

$$\begin{Vmatrix} (1/\alpha'^2)\left[1 + (1 - C + y_0)^2/(\pi^2/6)\right] & -(1 - C + y_0)/(\pi^2/6) \\ -(1 - C + y_0)/(\pi^2/6) & \alpha'^2/(\pi^2/6) \end{Vmatrix}. \tag{11}$$

² See equations (5.2) of [1] and note $+\partial(\bar{z}/z_0)/\partial \alpha$ in second equation of (5.2) should
read $-\partial(\bar{z}/z_0)/\partial \alpha$.

This gives the solution sought. From the general theory of the maximum
likelihood solution (see [2]) the distribution of $[\sqrt{n}(\hat{g} - g), \sqrt{n}(\hat{\alpha}' - \alpha')]$ is
asymptotically normal. Hence the marginal distribution of $\sqrt{n}(\hat{g} - g)$ will be
asymptotically normal, and for finite $n$, the standard deviation may be approximated
by

$$\sigma(\hat{g} - g) = \left[1/(\sqrt{n}\,\alpha')\right] \sqrt{1 + (1 - C + y_0)^2/(\pi^2/6)}. \tag{12}$$

Now the correlation coefficient for the asymptotic bivariate normal distribution
is seen to be

$$r = -(1 - C + y_0)\big/\sqrt{\pi^2/6 + (1 - C + y_0)^2}.$$

If $\alpha'$ were known, we should have the standard deviation of $\sqrt{n}(\hat{g} - g)$ reduced
by the factor $\sqrt{1 - r^2}$. This is found to be equal to the reciprocal of the second
factor in the equation (12). Hence we conclude that if $\alpha'$ be known, the standard
deviation of $(\hat{g} - g)$, for finite $n$, is given approximately by

$$\sigma(\hat{g} - g) = 1/(\sqrt{n}\,\alpha'). \tag{13}$$
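The matrix algebra of this section can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names are hypothetical, and the numerical values of $n$, $\alpha'$ and $y_0$ are borrowed from the example of Section 5 purely for illustration. It inverts the $2 \times 2$ matrix of expected second derivatives, multiplies by $n$, and compares the result with the determinant $n^2(\pi^2/6)$, the leading element of (11) and the correlation coefficient $r$.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant C

def information_matrix(n, alpha, y0):
    """Matrix of E[-L_gg], E[-L_g alpha'], E[-L_alpha' alpha'] from the text."""
    k = 1.0 - EULER_C + y0
    return [
        [n * alpha ** 2, n * k],
        [n * k, (n / alpha ** 2) * (math.pi ** 2 / 6 + k ** 2)],
    ]

def inverse_2x2(m):
    """Return the inverse of a 2x2 matrix together with its determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    return inv, det

# Illustrative values only (taken from the example in Section 5).
n, alpha, y0 = 57, 0.01924, 4.60015
k = 1.0 - EULER_C + y0

inv, det = inverse_2x2(information_matrix(n, alpha, y0))
cov = [[n * v for v in row] for row in inv]          # reciprocal matrix times n
r = cov[0][1] / math.sqrt(cov[0][0] * cov[1][1])     # asymptotic correlation

print(det, n ** 2 * math.pi ** 2 / 6)
print(cov)
print(r, math.sqrt(1 - r ** 2))
```

The last line prints $\sqrt{1 - r^2}$, which should equal the reciprocal of the second factor of (12).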





5. An example. Using the same example outlined in the previous paper (see [1,
pp. 307-309]), we have $n = 57$, $\hat{\alpha}' = .01924$, $1 - C = .422784$, $y_0 = 4.60015$.
This gives $\sigma = 27.826$. For a 95% confidence interval we take $(1.96)\sigma = 54.54$,
and with $\hat{u} = 180.6$, so that $\hat{g} = \hat{u} + y_0/\hat{\alpha} = 419.7$,
the interval is approximated by

$$|\hat{g} - g| < 54.5,$$

which as an approximation gives the symmetrical interval

$$365.2 < g < 474.2.$$

Method 4 used in the previous paper gave the longer interval (see Introduction)
which was not symmetrical about $\hat{g}$:

$$362.8 < g < 507.4.$$
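The arithmetic of this example can be reproduced in a few lines. The sketch below uses hypothetical function names; it assumes that the quantity quoted as 180.6 is the estimate $\hat{u}$, and it takes $y_0 = -\log(-\log F_0)$ from equation (1) and the standard deviation from equation (12).

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant C

def reduced_variate(F0):
    """Solve F0 = exp(-exp(-y0)) for y0, as in equation (1)."""
    return -math.log(-math.log(F0))

def sigma_g(n, alpha, y0):
    """Approximate standard deviation of g-hat from equation (12)."""
    k = 1.0 - EULER_C + y0
    return (1.0 / (math.sqrt(n) * alpha)) * math.sqrt(1.0 + k ** 2 / (math.pi ** 2 / 6))

# Values quoted in the example; u_hat = 180.6 is an assumed reading of the text.
n, alpha_hat, u_hat = 57, 0.01924, 180.6
y0 = reduced_variate(0.99)

sigma = sigma_g(n, alpha_hat, y0)
g_hat = u_hat + y0 / alpha_hat                 # equation (10)
half_width = 1.96 * sigma
lower, upper = g_hat - half_width, g_hat + half_width
sigma_alpha_known = 1.0 / (math.sqrt(n) * alpha_hat)   # equation (13)

print(round(y0, 5), round(sigma, 3), round(half_width, 2))
print(round(g_hat, 1), round(lower, 1), round(upper, 1))
```

The value from (13) is included only to show how much shorter the interval would be if $\alpha'$ were known.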


REFERENCES 


[1] B. F. KIMBALL, "Sufficient statistical estimation functions for the parameters of the
distribution of maximum values," Annals of Math. Stat., Vol. 17 (1946), pp.
299-309.

[2] S. S. WILKS, Mathematical Statistics, Princeton Univ. Press, 1943, p. 139.








NOTES 


This section is devoted to brief research and expository articles and other short items. 




TESTS OF INDEPENDENCE IN CONTINGENCY TABLES 
AS UNCONDITIONAL TESTS 


By A. M. MOOD
Iowa State College¹


Summary and introduction. Since the ordinary tests for independence in 
contingency tables use test criteria whose distributions depend on unknown 
parameters, the justification for the tests is usually made either by an appeal to 
asymptotic theory or by interpreting the tests as conditional tests. The latter 
approach employs the conditional distribution of the cell frequencies given the 
marginal totals, and was first described by Fisher [1]. The purpose of the 
present note is to show how these tests may be regarded as unconditional tests 
even though the parameters are unknown by augmenting the test criterion to 
include estimates of the unknown parameters. We present no new tests,
merely a new setting for the old tests which seems to put them in a little better
light.


1. Certain conditional tests. A variate or set of variates $x$ has a probability
density function $f(x; \theta)$ under a null hypothesis involving a parameter or set of
parameters $\theta$. When the parameters have a set of sufficient estimators $\hat{\theta}$, the
joint density function of a random sample of size $n$ may be put in the form

$$\prod_{i=1}^{n} f(x_i; \theta) = g(x_1, x_2, \cdots, x_n \mid \hat{\theta})\, h(\hat{\theta}; \theta). \tag{1}$$

It is assumed that $n$ exceeds the number of parameters. We shall be concerned
with the class of test criteria which are not functions of the estimators alone.
Let $\lambda(x_1, x_2, \cdots, x_n)$ be a test criterion which may not be put in the form $\lambda(\hat{\theta})$.
The joint density function for $\lambda$ and $\hat{\theta}$, obtained by summing (1) for fixed $\lambda$ and
$\hat{\theta}$, will be of the form

$$k(\lambda \mid \hat{\theta})\, h(\hat{\theta}; \theta). \tag{2}$$

The marginal distribution of $\lambda$ will be denoted by $m(\lambda; \theta)$, the result of summing
(2) over $\hat{\theta}$ for fixed $\lambda$.

In order to test the hypothesis in question one would like to divide the $\lambda$
space into two regions, an acceptance region $S_a$ and a critical region $S_c$, in such a
way that $S_c$ would have a prescribed size $\alpha$ under the null hypothesis. One
would of course set up other specifications to be fulfilled by $S_c$, but we are
interested here only in the fact that the size of $S_c$ cannot be determined because
of the presence of the unknown parameters $\theta$ in $m(\lambda; \theta)$.

One can set up a conditional test by using the conditional distribution $k(\lambda \mid \hat{\theta})$.
That is, for fixed $\hat{\theta}$, the measure of any region $R(\hat{\theta})$ (which is measurable relative
to $k(\lambda \mid \hat{\theta})$, say, in the Lebesgue-Stieltjes sense) of the $\lambda$ space is known because
the $\hat{\theta}$ are known in any given instance. Thus a conditional test can be made
with a critical region $R_c(\hat{\theta})$ of prescribed size.

¹ The author is now with The RAND Corporation, Santa Monica, California.

The conditional test may be interpreted as an unconditional test in the present
instance in the following manner: the unconditional test is made by using the
double criterion $(\lambda, \hat{\theta})$. The $(\lambda, \hat{\theta})$ space is divided into two regions, $T_a$ for
acceptance and $T_c$ for rejection. The critical region $T_c$ consists of all points
$(\lambda, \hat{\theta})$ such that $\lambda$ is contained in $R_c(\hat{\theta})$. If the size of $R_c(\hat{\theta})$ is $\alpha$ for all $\hat{\theta}$, then
the size of $T_c$ is also $\alpha$, for

$$\iint_{T_c} k(\lambda \mid \hat{\theta})\, h(\hat{\theta}; \theta)\, d\lambda\, d\hat{\theta} = \int \left[\int_{R_c(\hat{\theta})} k(\lambda \mid \hat{\theta})\, d\lambda\right] h(\hat{\theta}; \theta)\, d\hat{\theta} = \alpha \int h(\hat{\theta}; \theta)\, d\hat{\theta} = \alpha. \tag{3}$$

In this way one can make an unconditional test of the hypothesis with a critical
region of prescribed size; of course one does not have complete freedom to
specify the shape of $T_c$, but he can control it to the extent that $R_c(\hat{\theta})$ may be
chosen arbitrarily for every $\hat{\theta}$. $T_c$ is of course a similar region in the sense of
Neyman and Pearson [2, 3, 4] for the augmented criterion, and the construction
of $T_c$ is essentially the same as that used by Neyman and Pearson to test parameters
with sufficient estimators.


2. Application to contingency tables. As an illustration we shall follow
Wilks' [5] treatment of a two-way table with $r$ rows and $c$ columns; the cell
frequencies are $n_{ij}$ and the cell probabilities are $p_{ij}$ with

$$\sum_{i,j} n_{ij} = n; \quad \sum_{i,j} p_{ij} = 1; \quad i = 1, 2, \cdots, r; \quad j = 1, 2, \cdots, c. \tag{4}$$

The sample is thus regarded as having come from a multinomial population.
We let

$$p_{i\cdot} = \sum_j p_{ij}; \quad p_{\cdot j} = \sum_i p_{ij}; \quad n_{i\cdot} = \sum_j n_{ij}; \quad n_{\cdot j} = \sum_i n_{ij}. \tag{5}$$

The null hypothesis $H_0$ (of independence) corresponds to the subspace for which

$$p_{ij} = p_i q_j; \quad \Sigma p_i = 1 = \Sigma q_j \tag{6}$$

in the parameter space of the $p_{ij}$. The likelihood ratio criterion for testing $H_0$ is

$$\lambda = \frac{\left(\prod_i n_{i\cdot}^{\,n_{i\cdot}}\right)\left(\prod_j n_{\cdot j}^{\,n_{\cdot j}}\right)}{n^n \prod_{i,j} n_{ij}^{\,n_{ij}}} \tag{7}$$

and its distribution depends on the unknown parameters $p_i$ and $q_j$. However
the parameters have sufficient estimators

$$\hat{p}_i = n_{i\cdot}/n, \quad \hat{q}_j = n_{\cdot j}/n \tag{8}$$

for the marginal distribution of the $n_{i\cdot}$ and $n_{\cdot j}$ is

$$\frac{(n!)^2}{\left(\prod n_{i\cdot}!\right)\left(\prod n_{\cdot j}!\right)} \left(\prod p_i^{\,n_{i\cdot}}\right)\left(\prod q_j^{\,n_{\cdot j}}\right) \tag{9}$$

and when this is divided into the distribution of the $n_{ij}$ (under the null hypothesis)
one finds the conditional distribution of the $n_{ij}$ to be

$$g(n_{11}, n_{12}, \cdots, n_{rc} \mid n_{1\cdot}, n_{2\cdot}, \cdots, n_{\cdot c}) = \frac{\left(\prod n_{i\cdot}!\right)\left(\prod n_{\cdot j}!\right)}{n! \prod n_{ij}!} \tag{10}$$

which is independent of the parameters. The distribution (10) is just the
combinatorial distribution used ordinarily in deriving the distribution of $\lambda$
for small samples. The test for independence is therefore a conditional test
which however may be interpreted as an unconditional test if the criterion $\lambda$ is
augmented by the estimators of the parameters under the null hypothesis.
Instead of the likelihood ratio criterion Karl Pearson's Chi-square criterion
could just as well have been used since its conditional distribution is also
determined by (10).

The usual difficulty due to discreteness arises in this application to contingency
tables. It is not possible to make the significance level exactly $\alpha$. In terms
of the notation of the first section, $R_c(\hat{\theta})$ cannot be chosen so that it will have
size exactly equal to $\alpha$ for all $\hat{\theta}$. One would ordinarily replace the equalities by
inequalities. The $R_c(\hat{\theta})$ would be chosen to have size less than but as close to $\alpha$
as possible. The size of $T_c$ is then unspecified and one can only state that his
significance level is less than $\alpha$. This difficulty is not particularly serious in
practice unless the test criterion has only one degree of freedom.
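The conditional distribution (10) is easy to evaluate exactly for a small table. The sketch below is illustrative only: the function names and the marginal totals (5, 4) and (3, 6) are hypothetical. It enumerates every $2 \times 2$ table with the given margins and confirms that the probabilities from (10) sum to one without reference to $p_i$ or $q_j$.

```python
from fractions import Fraction
from math import factorial, prod

def conditional_probability(table):
    """Probability (10) of a table of cell counts given its marginal totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    numerator = prod(factorial(v) for v in row_totals) * prod(factorial(v) for v in col_totals)
    denominator = factorial(n) * prod(factorial(v) for row in table for v in row)
    return Fraction(numerator, denominator)

def tables_2x2(row_totals, col_totals):
    """Yield every 2x2 table of non-negative counts with the given margins."""
    r1, r2 = row_totals
    c1, _ = col_totals
    for a in range(max(0, c1 - r2), min(r1, c1) + 1):
        yield [[a, r1 - a], [c1 - a, r2 - c1 + a]]

# Hypothetical margins chosen only for illustration.
probs = [conditional_probability(t) for t in tables_2x2((5, 4), (3, 6))]
total = sum(probs)
print(probs, total)
```

A conditional critical region $R_c(\hat{\theta})$ of size at most $\alpha$ could then be assembled from the least probable tables, which is where the discreteness difficulty described above appears.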


REFERENCES 


[1] R. A. FISHER, Statistical Methods for Research Workers, Oliver and Boyd, London,
1946, pp. 96, 97.

[2] J. NEYMAN AND E. S. PEARSON, "On the problem of the most efficient tests of statistical
hypotheses," Roy. Soc. Phil. Trans., Series A, Vol. 231 (1933), p. 289.

[3] J. NEYMAN AND E. S. PEARSON, "Sufficient statistics and uniformly most powerful
tests of statistical hypotheses," Stat. Res. Memoirs, Vol. 1 (1936), p. 113.

[4] J. NEYMAN, "Outline of a theory of statistical estimation based on the classical theory
of probability," Roy. Soc. Phil. Trans., Series A, Vol. 236 (1937), p. 364.

[5] S. S. WILKS, Mathematical Statistics, Princeton University Press, 1943, pp. 213-220.




THE 5% SIGNIFICANCE LEVELS FOR SUMS OF SQUARES 
OF RANK DIFFERENCES AND A CORRECTION 


By EDWIN G. OLDS
Carnegie Institute of Technology 


About ten years ago this author published a paper [1], containing tables for 
use in testing the significance of the rank correlation coefficient. In a paper on 
non-parametric tests, [2, p. 316] Scheffé remarks that it would be desirable 
to have these tables extended by inclusion of the 5% values. When the com- 
putation was begun it was noted that a necessary formula was given incorrectly. 
The main purpose of this note is to correct the formula and to extend Table V, 
[1, p. 148]. Incidentally, a minor addition for Table III, [1, p. 143] will be 
supplied. 

The formula for the rank correlation coefficient, $r'$, is given by

$$r' = 1 - \frac{6\,\Sigma d^2}{n^3 - n},$$

where $n$ is the number of individuals ranked and $\Sigma d^2 = \sum_{i=1}^{n} d_i^2$ ($d_i$ being the rank
difference for the $i$th individual). As noted in the original paper, the null
hypothesis, $r' = 0$, is equivalent to the hypothesis $\Sigma d^2 = (n^3 - n)/6$, and the
latter hypothesis is slightly more convenient to test. Scheffé's remark seems
to be directed at Table V, which gives, for $11 \leq n \leq 30$, pairs of values between
which $\Sigma d^2$ has a probability, $P$, of being included. Values are tabled for $P = .99$,
$.98$, $.96$, $.90$ and $.80$. The necessary values for $P = .95$ are given below and
can easily be copied in the left-hand margin of the original Table [1, p. 148].
These values, as in the previous case, have been calculated by using the fact that

$$\frac{\Sigma d^2}{2} - \frac{n^3 - n}{12}$$

has an approximately normal distribution with a mean of zero and a variance of
$(n - 1)[n(n + 1)/12]^2$. In the original paper, [1, p. 142] the denominator
in the bracketed part of the variance was printed as 6, instead of 12.
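The tabled 5% values can be regenerated from the normal approximation just stated. This is a minimal sketch with a hypothetical function name; it assumes the two-sided normal deviate 1.96 and doubles the standard deviation of $\Sigma d^2 / 2$ to obtain that of $\Sigma d^2$.

```python
import math

def limits_95(n, z=1.96):
    """Approximate limits enclosing sum(d^2) with probability .95 for n ranks."""
    mean = (n ** 3 - n) / 6                            # expected value of sum(d^2)
    sd_half = math.sqrt(n - 1) * n * (n + 1) / 12      # standard deviation of sum(d^2)/2
    sd = 2 * sd_half                                   # standard deviation of sum(d^2)
    return mean - z * sd, mean + z * sd

table = {n: limits_95(n) for n in range(11, 31)}
for n, (low, high) in table.items():
    print(f"{n:2d}  {low:7.1f}  {high:7.1f}")
```

Using 6 in place of 12 in the bracketed factor would double the standard deviation, which is the misprint the note corrects.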

In this author's original paper the exact frequencies of sums of squares of
rank differences were given for $n = 2$ to $n = 7$ inclusive, [1, p. 139]. The same
results, together with the results for $n = 8$, were obtained (independently) by
Kendall and others and published some months later, [3, p. 255]. Therefore,
it is possible to extend slightly the comparison of approximating functions
given in Table III, [1, p. 143]. Using Kendall's results for $n = 8$ it is found
that when the approximations obtained by using a Pearson Type II curve are
compared with exact results the average and maximum differences of cumulatives
are .0013 and .0067 respectively. When approximations are made by using the
normal curve the corresponding errors are .0081 and .0163.







REFERENCES 

[1] E. G. OLDS, "Distribution of the sums of squares of rank differences for small numbers
of individuals," Annals of Math. Stat., Vol. 9 (1938), pp. 133-148.

[2] H. SCHEFFÉ, "Statistical inference in the non-parametric case," Annals of Math. Stat.,
Vol. 14 (1943), pp. 305-332.

[3] M. G. KENDALL, SHEILA F. H. KENDALL AND B. BABINGTON SMITH, "The distribution
of Spearman's coefficient of rank correlation in a universe in which all rankings
occur an equal number of times," Biometrika, Vol. 30 (1939), pp. 251-273.


TABLE V (Extended)
Pairs of values between which $\Sigma d^2$ has a probability, $P$, of being included

 n |      P = .95
11 |   83.6 |  356.4
12 |  117.0 |  455.0
13 |  158.0 |  570.0
14 |  207.7 |  702.3
15 |  266.7 |  853.3
16 |  335.9 | 1024.1
17 |  416.2 | 1215.8
18 |  508.4 | 1429.6
19 |  613.3 | 1666.7
20 |  732.0 | 1928.0
21 |  865.1 | 2214.9
22 | 1013.5 | 2528.5
23 | 1178.2 | 2869.8
24 | 1360.0 | 3240.0
25 | 1559.8 | 3640.2
26 | 1778.4 | 4071.6
27 | 2016.7 | 4535.3
28 | 2275.7 | 5032.3
29 | 2556.2 | 5563.8
30 | 2859.0 | 6131.0



INDEPENDENCE OF NON-NEGATIVE QUADRATIC FORMS IN 
NORMALLY CORRELATED VARIABLES 


By BERTIL MATÉRN
Forest Research Institute, Experimentalfältet, Sweden


In a recent paper by the author [5] the following theorem has been mentioned 
without proof. Though the theorem is very simple and easy to prove the 
author has not found it elsewhere in the literature. 

THEOREM. If two non-negative quadratic forms in normally correlated variables 
with zero means are uncorrelated the two forms are independent. 

To prove the theorem, let the two forms be

$$Q_1 = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j, \qquad Q_2 = \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j, \tag{1}$$

where the $x_i$'s are normally correlated and all have mean 0. By a well-known
theorem on quadratic forms we can reduce $Q_1$ and $Q_2$ to the forms

$$Q_1 = \sum_{i=1}^{n} c_i y_i^2, \qquad Q_2 = \sum_{i=1}^{n} d_i z_i^2, \tag{2}$$

where the $y_i$'s and $z_i$'s are linear functions of the $x_i$'s. In the $2n$-dimensional
normal distribution of the $y_i$'s and the $z_i$'s, let $\rho_{ij}$ be the covariance of $y_i$ and
$z_j$. It is then easily shown that the covariance of $y_i^2$ and $z_j^2$ is $2\rho_{ij}^2$, and hence
that

$$\operatorname{cov}(Q_1, Q_2) = 2 \sum_i \sum_j c_i d_j \rho_{ij}^2. \tag{3}$$

As the forms are supposed to be non-negative all coefficients in (2) are
non-negative. If $Q_1$ and $Q_2$ are uncorrelated, each term on the right hand of (3)
must vanish. Consequently, if $c_i \neq 0$ and $d_j \neq 0$, we must have $\rho_{ij} = 0$. This
means that all $y_i$'s in $Q_1$ with non-zero coefficients are independent of all $z_j$'s in
$Q_2$ with non-zero coefficients. Hence $Q_1$ and $Q_2$ are independent. Q.E.D.

To see if $Q_1$ and $Q_2$ are uncorrelated we need an expression for the covariance
of the two forms in terms of the coefficients in (1) and the variances and
covariances of the original variables $x_i$. Let $A$ and $B$ be the matrices of the two
forms (1). Clearly we may suppose $A$ and $B$ to be symmetric. Let the
variance-covariance matrix of the $x_i$'s be $L$. By straightforward calculations we find

$$\operatorname{cov}(Q_1, Q_2) = 2 \operatorname{Tr} ALBL. \tag{4}$$

Here we have used $\operatorname{Tr} M$ to denote the "trace," i.e. the sum of the diagonal
elements in a square matrix $M$. In case of independent variables with variance 1,
we get

$$\operatorname{cov}(Q_1, Q_2) = 2 \operatorname{Tr} AB. \tag{5}$$

The formulae (4) and (5) are given in [5].
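Formula (4) can be checked against a direct fourth-moment computation. The sketch below is not the derivation of [5]; it assumes the standard product-moment identity for zero-mean normal variables, $\operatorname{cov}(x_i x_j, x_k x_l) = L_{ik} L_{jl} + L_{il} L_{jk}$, and the matrices used are hypothetical integer examples.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(M):
    """Sum of the diagonal elements of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def cov_by_trace(A, B, L):
    """Formula (4): cov(Q1, Q2) = 2 Tr(A L B L)."""
    return 2 * trace(matmul(matmul(A, L), matmul(B, L)))

def cov_by_moments(A, B, L):
    """Sum a_ij b_kl cov(x_i x_j, x_k x_l) using the normal moment identity."""
    n = len(L)
    return sum(A[i][j] * B[k][l] * (L[i][k] * L[j][l] + L[i][l] * L[j][k])
               for i in range(n) for j in range(n)
               for k in range(n) for l in range(n))

# Hypothetical symmetric non-negative forms and covariance matrix.
A = [[2, 1], [1, 1]]
B = [[1, -1], [-1, 2]]
L = [[2, 1], [1, 3]]
print(cov_by_trace(A, B, L), cov_by_moments(A, B, L))

# Forms in separate independent unit-variance variables are uncorrelated, as in (5).
identity = [[1, 0], [0, 1]]
print(cov_by_trace([[1, 0], [0, 0]], [[0, 0], [0, 1]], identity))
```

Integer entries keep both computations exact, so the two values can be compared without rounding.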










It is interesting to note the simplification of the independence condition given
in [2, 5] which is possible when the forms are assumed to be non-negative. It
may also be of interest to note that the condition for independence given in
the present theorem is identical with the corresponding condition for two linear
forms. (In fact, the latter condition has been used in the above proof.) Further
we observe that if $Q_2$ is the square of a linear form with mean 0, we get a necessary
and sufficient condition for independence between a linear form and a
non-negative quadratic form. The corresponding condition when $Q_1$ is not supposed
to be non-negative has been given in [4].

As an application consider a quadratic form $Q$ in normally correlated variables.
Let it be known that $Q$ has a $\chi^2$-distribution with $f$ degrees of freedom. If
further

$$Q = Q_1 + Q_2 + \cdots + Q_k, \tag{6}$$

where the $Q_i$'s are non-negative and mutually uncorrelated quadratic forms,
then each $Q_i$ has a $\chi^2$-distribution with $f_i$ degrees of freedom, say, and $\Sigma f_i = f$.
The proof with the aid of the above theorem is almost immediate. We thus
get another formulation of the theorem of Cochran [1] on the decomposition of a
quadratic form.


REFERENCES 

[1] W. G. COCHRAN, "Distribution of quadratic forms in a normal system with applications
to the analysis of covariance," Proc. Cambr. Phil. Soc., Vol. 30 (1934), pp. 178-191.

[2] A. T. CRAIG, "Note on the independence of certain quadratic forms," Annals of Math.
Stat., Vol. 14 (1943), pp. 195-197.

[3] H. HOTELLING, "Note on a matric theorem of A. T. Craig," Annals of Math. Stat.,
Vol. 15 (1944), pp. 427-429.

[4] M. KAC, "A remark on independence of linear and quadratic forms involving
independent gaussian variables," Annals of Math. Stat., Vol. 16 (1945), pp. 400-401.

[5] B. MATÉRN, "Metoder att uppskatta noggrannheten vid linje- och provytetaxering"
("Methods of estimating the accuracy of line and sample plot surveys"),
Meddelanden från Statens Skogsforskningsinstitut, Vol. 36 (1947), pp. 1-188.




A FORMULA FOR THE PARTIAL SUMS OF SOME 
HYPERGEOMETRIC SERIES 


By HERMANN VON SCHELLING
Naval Medical Research Laboratory, New London, Conn.¹

Let an urn contain $N$ balls of which $a$ are black and $b$ white. A single ball
is drawn. We note its color, return the ball into the urn and add $\Delta$ balls of the
same color. The probability $w(n_1)$ to obtain $n_1$ black balls in $n$ trials is given
by a formula due to F. Eggenberger and G. Pólya [1]:


1 Opinions or conclusions contained in this paper are those of the author. They are not 
to be construed as necessarily reflecting the views or endorsement of the Navy Department. 


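$$w(n_1) = \binom{n}{n_1} \frac{a(a + \Delta) \cdots [a + (n_1 - 1)\Delta] \cdot b(b + \Delta) \cdots [b + (n - n_1 - 1)\Delta]}{N(N + \Delta) \cdots [N + (n - 1)\Delta]} \tag{1}$$

($n$ fixed, $n_1$ variable).

Now, we fix $n_1$ and ask for the probability that the $n_1$th black ball appears at
the $n$th drawing. We find

$$w(n) = \binom{n - 1}{n_1 - 1} \frac{a(a + \Delta) \cdots [a + (n_1 - 1)\Delta] \cdot b(b + \Delta) \cdots [b + (n - n_1 - 1)\Delta]}{N(N + \Delta) \cdots [N + (n - 1)\Delta]} \tag{2}$$

($n_1$ fixed, $n$ variable).

This function is the $(n - n_1 + 1)$th element of the series

$$\frac{\frac{a}{\Delta}\left(\frac{a}{\Delta} + 1\right) \cdots \left[\frac{a}{\Delta} + n_1 - 1\right]}{\frac{N}{\Delta}\left(\frac{N}{\Delta} + 1\right) \cdots \left[\frac{N}{\Delta} + n_1 - 1\right]} \cdot F\left(n_1, \frac{b}{\Delta}, \frac{N}{\Delta} + n_1; 1\right).$$

Consequently, the probability that the $n_1$th black ball appears at the latest in the
$n$th drawing reads, with an obvious abbreviation,

$$W(n) = \sum_{i = n_1}^{n} w(i) = \frac{\frac{a}{\Delta}\left(\frac{a}{\Delta} + 1\right) \cdots \left[\frac{a}{\Delta} + n_1 - 1\right]}{\frac{N}{\Delta}\left(\frac{N}{\Delta} + 1\right) \cdots \left[\frac{N}{\Delta} + n_1 - 1\right]} \cdot F_{n - n_1 + 1}\left(n_1, \frac{b}{\Delta}, \frac{N}{\Delta} + n_1; 1\right). \tag{3}$$

Now, we assume the $n_1$th black ball did not appear in the $n$th drawing. What
is the alternative? The $(n - n_1 + 1)$th white ball must have appeared in the
$n$th drawing at latest. The corresponding probability is according to the
equation (3)

$$\overline{W}(n) = \sum_{i = n - n_1 + 1}^{n} \overline{w}(i) = \frac{\frac{b}{\Delta}\left(\frac{b}{\Delta} + 1\right) \cdots \left[\frac{b}{\Delta} + (n - n_1)\right]}{\frac{N}{\Delta}\left(\frac{N}{\Delta} + 1\right) \cdots \left[\frac{N}{\Delta} + (n - n_1)\right]} \cdot F_{n_1}\left(n - n_1 + 1, \frac{a}{\Delta}, \frac{N}{\Delta} + n - n_1 + 1; 1\right). \tag{4}$$

The relation (4) originates from (3) by writing $b$ instead of $a$ and $(n - n_1 + 1)$
instead of $n_1$. The alternatives add to certainty:

$$W(n) + \overline{W}(n) = 1. \tag{5}$$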
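Relation (5) can be verified with exact rational arithmetic. The sketch below is illustrative: the function names and the urn composition ($a = 3$, $b = 5$, $\Delta = 2$, $n = 10$, $n_1 = 4$) are hypothetical. It evaluates (2) term by term, sums as in (3) and (4), and also compares $W(n)$ with the tail of the distribution (1).

```python
from fractions import Fraction
from math import comb

def rising(x, step, count):
    """Product x (x + step) ... [x + (count - 1) step]; equals 1 when count is 0."""
    out = Fraction(1)
    for k in range(count):
        out *= x + k * step
    return out

def prob_exactly(a, b, delta, m, n):
    """Equation (1): probability of m black balls in n drawings."""
    return comb(n, m) * rising(a, delta, m) * rising(b, delta, n - m) / rising(a + b, delta, n)

def prob_kth_at_draw(a, b, delta, k, n):
    """Equation (2): probability that the k-th black ball appears at drawing n."""
    return comb(n - 1, k - 1) * rising(a, delta, k) * rising(b, delta, n - k) / rising(a + b, delta, n)

def prob_kth_by_draw(a, b, delta, k, n):
    """Equation (3): probability that the k-th black ball appears by drawing n."""
    return sum(prob_kth_at_draw(a, b, delta, k, i) for i in range(k, n + 1))

# Hypothetical urn, for illustration only.
a, b, delta, n, n1 = 3, 5, 2, 10, 4
W = prob_kth_by_draw(a, b, delta, n1, n)
# Equation (4): exchange the colors and ask for the (n - n1 + 1)-th white ball.
W_bar = prob_kth_by_draw(b, a, delta, n - n1 + 1, n)
tail = sum(prob_exactly(a, b, delta, m, n) for m in range(n1, n + 1))
print(W, W_bar, W + W_bar)
```

The sum for `W_bar` has only $n_1$ terms while the sum for `W` has $n - n_1 + 1$, which is the computational saving the note points to when $n_1$ is small.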








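Change the notations in the following manner:

$$n_1 \to \alpha; \quad \frac{b}{\Delta} \to \beta; \quad \frac{N}{\Delta} + n_1 \to \gamma; \quad n - n_1 + 1 \to \nu. \tag{6}$$

From (6.1) and (6.4) find by addition

$$n \to \nu + \alpha - 1. \tag{7}$$

From (6.1) and (6.3)

$$\frac{N}{\Delta} \to \gamma - \alpha. \tag{8}$$

From (6.2) and (8)

$$\frac{a}{\Delta} \to \gamma - \alpha - \beta. \tag{9}$$

Formula (5) reads now

$$\frac{(\gamma - \alpha - \beta)(\gamma - \alpha - \beta + 1) \cdots (\gamma - \beta - 1)}{(\gamma - \alpha)(\gamma - \alpha + 1) \cdots (\gamma - 1)} \cdot F_\nu(\alpha, \beta, \gamma; 1) + \frac{\beta(\beta + 1) \cdots (\beta + \nu - 1)}{(\gamma - \alpha)(\gamma - \alpha + 1) \cdots (\gamma - \alpha + \nu - 1)} \cdot F_\alpha(\nu, \gamma - \beta - \alpha, \gamma - \alpha + \nu; 1) = 1. \tag{10}$$

$F_\nu(\alpha, \beta, \gamma; 1)$ denotes the partial sum of the first $\nu$ elements of the hypergeometric
series $F(\alpha, \beta, \gamma; 1)$. It is to be mentioned that $\alpha$ is a positive integer necessarily
as follows from (6.1). Since

$$\frac{(\gamma - \alpha - \beta)(\gamma - \alpha - \beta + 1) \cdots (\gamma - \beta - 1)}{(\gamma - \alpha)(\gamma - \alpha + 1) \cdots (\gamma - 1)} = \frac{1}{F(\alpha, \beta, \gamma; 1)},$$

and similarly for the second coefficient, the relation (10) can be written

$$\frac{F_\nu(\alpha, \beta, \gamma; 1)}{F(\alpha, \beta, \gamma; 1)} + \frac{F_\alpha(\nu, \gamma - \beta - \alpha, \gamma - \alpha + \nu; 1)}{F(\nu, \gamma - \beta - \alpha, \gamma - \alpha + \nu; 1)} = 1, \tag{11}$$

where $\nu$ and $\alpha$ are positive integers.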

This result is not interesting from the standpoint of pure mathematics since
the sum $F(\alpha, \beta, \gamma; 1)$ is known. But the relation is useful for the statisticians.
In calculating the function $W(n)$ they need a sum of $n_1$ elements instead of
$(n - n_1 + 1)$. If $n_1$ is small (and this holds in practical applications), the
exact calculation of $W(n)$ is possible for every $n$.








REFERENCES 


[1] F. EGGENBERGER AND G. PÓLYA, Zeits. f. angew. Math. und Mech., Vol. 3 (1923), pp.
279-289.




THE VARIANCE OF THE PROPORTIONS OF SAMPLES FALLING 
WITHIN A FIXED INTERVAL FOR A 
NORMAL POPULATION 
By G. A. BAKER 


University of California, Davis 


Suppose that we have a normal population 


i ; (a — m)* 
(1) oT 0 V/2e exp{— oo 


and we draw samples of N from this population. We wish to estimate the 
proportion, p, of the population between two fixed limits, $m + \lambda\sigma$ and $m + \mu\sigma$. 
One way to make this estimate is simply to count the number of observed x's 
which fall in this interval. We shall denote this number by n. Then the ratio 


(2) n/N 

is an estimate of p. If this is done, the variance of the estimate is well known to be 

$$\frac{p(1-p)}{N}. \tag{3}$$


The method of estimating p by counting the number in a definite interval is 
nonparametric and requires no assumption of normal or other specified type of 
sampled population for validity. However, if we know that the sampled 
population is normal then we may make use of this knowledge in estimating p 
and possibly obtain an improved estimate. 

Another way to estimate p which makes use of the form of the sampled 
population is to compute 


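$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad s^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2, \tag{4}$$

and hence the integral 

$$\int_{m+\lambda\sigma}^{m+\mu\sigma} \frac{1}{s\sqrt{2\pi}}\,e^{-(x-\bar{x})^2/(2s^2)}\,dx. \tag{5}$$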


It is implied in elementary texts that (5) is a better estimate of p than is (2) 
although this point is not discussed. 

It is the purpose of the present note to discuss the variance of the estimate (5) 
and compare this variance with (3). 

Now (5) is a function of the first two moments of the sample and it follows 
from an application of a theorem stated by H. Cramér [1] that (5) is asymptot- 
ically normal with mean p and variance given by 











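$$\frac{1}{2\pi N}\left[\left(e^{-\mu^2/2} - e^{-\lambda^2/2}\right)^2 + \frac{1}{2}\left(\mu e^{-\mu^2/2} - \lambda e^{-\lambda^2/2}\right)^2\right]. \tag{6}$$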


To compare the relative efficiency of the counting method with (6) in complete 
detail would be somewhat tedious. The referee suggests a brief discussion of 
the cases $\lambda = -\infty$, where we are counting the proportion less than some known 
value, and $\lambda = -\mu$, where a portion out of the middle of the distribution is being 
counted. These cases are of particular practical interest. 

If $\lambda = -\infty$, then (6) becomes 

$$\frac{e^{-\mu^2}}{2\pi N}\left[1 + \frac{\mu^2}{2}\right]. \tag{7}$$


We choose values of $\mu$ as indicated below: 


      μ          p      Relative Efficiency of (3) 
  -2.3263      0.01               0.27 
  -1.2816      0.1                0.56 
  -0.8416      0.2                0.66 
  -0.5244      0.3                0.75 
  -0.2533      0.4                0.64 
   0.0000      0.5                0.64 





We get values of the relative efficiency of (3) that are low for small p and some- 
what higher for larger values of p. 

If $\lambda = -\mu$, then (6) becomes 

$$\frac{\mu^2 e^{-\mu^2}}{\pi N}. \tag{8}$$

We choose values of $\mu$ as indicated below: 


      μ          p      Relative Efficiency of (3) 
   1.2816      0.8                0.63 
   0.8416      0.6                0.46 
   0.2533      0.2                0.12 


We see that the relative efficiency of (3) ranges from close to 0.75 to rather small 
values. 

Other choices of $\lambda$ and $\mu$ yield relative efficiencies of about the same order of 
magnitude as those illustrated. 
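
The comparison can be reproduced numerically. The sketch below is an illustration only: it assumes the asymptotic variance of the normal-theory estimate is obtained by the usual first-order expansion in the sample mean and standard deviation, and all function names are hypothetical. Both variances are returned multiplied by N, so that N cancels in the ratio.

```python
import math


def phi(z):
    """Standard normal density, with the limit 0 at +/- infinity."""
    return 0.0 if math.isinf(z) else math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def z_phi(z):
    """z * phi(z), with the limit 0 at +/- infinity."""
    return 0.0 if math.isinf(z) else z * phi(z)


def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def n_var_counting(lam, mu):
    """N times the variance p(1 - p)/N of the counting estimate."""
    p = Phi(mu) - Phi(lam)
    return p * (1.0 - p)


def n_var_normal(lam, mu):
    """N times the asymptotic variance of the normal-theory estimate."""
    return (phi(mu) - phi(lam)) ** 2 + 0.5 * (z_phi(mu) - z_phi(lam)) ** 2


def relative_efficiency(lam, mu):
    """Relative efficiency of the counting estimate."""
    return n_var_normal(lam, mu) / n_var_counting(lam, mu)


for mu in (1.2816, 0.8416, 0.2533):
    print(mu, round(relative_efficiency(-mu, mu), 2))
```

Passing `-math.inf` for the lower limit gives the one-sided case in which the proportion below a known value is counted.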


REFERENCE 


[1] Harald Cramér, "Mathematical Methods of Statistics," Princeton University Press, 
1946, section 28.4, pp. 366-367. 


THE POINT BISERIAL COEFFICIENT OF CORRELATION 


By Joseph Lev 


New York State Department of Civil Service 


The product moment coefficient of correlation between a continuous variate y 
and a variate x which takes the values 1 and 0 only is known in psychological 
statistics as the point biserial coefficient of correlation. Let $y_i$, $i = 1, \dots, n$, 
be observations on y; $y_{1i}$, $i = 1, \dots, n_1$, be y values which are paired with the 
value x = 1; $y_{0i}$, $i = 1, \dots, n_0$, be values paired with x = 0; $\bar{y}_1$, $\bar{y}_0$, and $\bar{y}$ be 
the corresponding means; and $n = n_1 + n_0$. Then the point biserial coefficient 
of correlation may be written 

$$r = \frac{\sqrt{\dfrac{n_1 n_0}{n}}\,(\bar{y}_1 - \bar{y}_0)}{\sqrt{\displaystyle\sum_{i=0}^{1}\sum_{j=1}^{n_i}(y_{ij} - \bar{y})^2}}. \tag{1}$$
i=0 7=1 


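The distribution of r is readily obtained when the $y_i$, $i = 1, \dots, n$, are 
distributed as 

$$\frac{1}{\sigma\sqrt{2\pi(1-\rho^2)}}\exp\left\{-\frac{1}{2\sigma^2(1-\rho^2)}\left[y_i - \alpha - \rho\sigma\,\frac{x_i - \bar{x}}{\sigma_x}\right]^2\right\}, \tag{2}$$

where 

$$\frac{x_i - \bar{x}}{\sigma_x} = \begin{cases}\sqrt{n_0/n_1}, & i = 1, 2, \dots, n_1,\\ -\sqrt{n_1/n_0}, & i = n_1 + 1, n_1 + 2, \dots, n,\end{cases}$$

$\sigma^2$ is the variance of the $y_i$ about the common mean $\alpha$, and $\rho$ is the parameter 
which represents the correlation between the $y_i$ and the $x_i$. It is easy to verify 
that the statistic in (1) is a maximum likelihood estimate of $\rho$. 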

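It will be convenient to express the two population means in (2) as $\mu_1$ and $\mu_0$ 
so that 

$$\mu_1 = \alpha + \rho\sigma\sqrt{\frac{n_0}{n_1}}, \qquad \mu_0 = \alpha - \rho\sigma\sqrt{\frac{n_1}{n_0}}. \tag{3}$$

Hence 

$$\rho = \frac{\sqrt{n_1 n_0}}{n}\,\frac{\mu_1 - \mu_0}{\sigma}. \tag{4}$$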








Now write 

$$t = \frac{\sqrt{\dfrac{n_1 n_0}{n}}\,(\bar{y}_1 - \bar{y}_0)\,\sqrt{n-2}}{\sqrt{\displaystyle\sum_{i=0}^{1}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2}} = \frac{\sqrt{n-2}\;r}{\sqrt{1-r^2}}, \tag{5}$$

where r is obtained from (1). 
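
The equality of the two forms of t can be checked on any small data set. The following is a minimal sketch with hypothetical names and made-up observations; it computes r from the total sum of squares and t from the within-group sum of squares, independently of one another.

```python
import math


def point_biserial(y1, y0):
    """Return (r, t) for y-values paired with x = 1 (y1) and x = 0 (y0)."""
    n1, n0 = len(y1), len(y0)
    n = n1 + n0
    mean1, mean0 = sum(y1) / n1, sum(y0) / n0
    mean_all = (sum(y1) + sum(y0)) / n
    # Total and within-group sums of squares.
    total_ss = sum((v - mean_all) ** 2 for v in list(y1) + list(y0))
    within_ss = (sum((v - mean1) ** 2 for v in y1)
                 + sum((v - mean0) ** 2 for v in y0))
    scaled_diff = math.sqrt(n1 * n0 / n) * (mean1 - mean0)
    r = scaled_diff / math.sqrt(total_ss)
    t = scaled_diff * math.sqrt(n - 2) / math.sqrt(within_ss)
    return r, t


r, t = point_biserial([5.0, 6.0, 7.0, 9.0], [1.0, 2.0, 4.0, 5.0, 3.0])
print(r, t, math.sqrt(9 - 2) * r / math.sqrt(1 - r * r))
```

The last two printed numbers should agree up to rounding, since the total sum of squares is the within-group sum of squares plus the squared scaled difference of means.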

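Using (5) we may write t as 

$$t = \frac{\dfrac{(\bar{y}_1 - \bar{y}_0) - (\mu_1 - \mu_0)}{\sigma\sqrt{1-\rho^2}\,\sqrt{\dfrac{n}{n_1 n_0}}} + \dfrac{\mu_1 - \mu_0}{\sigma\sqrt{1-\rho^2}\,\sqrt{\dfrac{n}{n_1 n_0}}}}{\sqrt{\dfrac{1}{n-2}\displaystyle\sum_{i=0}^{1}\sum_{j=1}^{n_i}\frac{(y_{ij} - \bar{y}_i)^2}{\sigma^2(1-\rho^2)}}}.$$

Therefore t has the non-central t distribution [1] with 

$$\delta = \frac{\mu_1 - \mu_0}{\sigma\sqrt{1-\rho^2}\,\sqrt{\dfrac{n}{n_1 n_0}}} = \sqrt{n}\,\frac{\rho}{\sqrt{1-\rho^2}}. \tag{6}$$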


The methods and tables given in [1] may be used to calculate tests of significance 
and confidence limits for p. 

When $\rho = 0$, t has Student's distribution, and the statistic $t = \sqrt{n-2}\,r/\sqrt{1-r^2}$ 
may be used to test the hypothesis $\rho = 0$ by means of the t tables 
with n - 2 degrees of freedom. The non-central t distribution then determines 
the power function of this test. 

Table IV of [1] can be used to calculate confidence limits for $\rho$. If the con- 
fidence interval is to be based on equal tails of the distribution, choose a confidence 
coefficient $1 - 2\epsilon$. Then compute $\delta(f, t_0, \epsilon)$ and $\delta(f, t_0, 1 - \epsilon)$, where $f = n - 2$ 
and $t_0 = \sqrt{n-2}\,r/\sqrt{1-r^2}$. 

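A lower limit for $\rho$ is given by 

$$\frac{\delta(f, t_0, \epsilon)}{[n + \delta^2(f, t_0, \epsilon)]^{1/2}},$$

and an upper limit by 

$$\frac{\delta(f, t_0, 1 - \epsilon)}{[n + \delta^2(f, t_0, 1 - \epsilon)]^{1/2}}.$$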
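
The limits above are the inverse of the relation $\delta = \sqrt{n}\,\rho/\sqrt{1-\rho^2}$. A minimal sketch of the conversion follows; the function names are hypothetical and the two values of $\delta$ in the usage line are invented for illustration, not read from any table.

```python
import math


def delta_from_rho(rho, n):
    """Non-centrality parameter corresponding to a correlation rho."""
    return math.sqrt(n) * rho / math.sqrt(1.0 - rho * rho)


def rho_from_delta(delta, n):
    """Inverse relation: the correlation corresponding to a given delta."""
    return delta / math.sqrt(n + delta * delta)


# Hypothetical limits for delta with n = 20 observations.
n = 20
lower, upper = rho_from_delta(0.8, n), rho_from_delta(4.5, n)
print(lower, upper)
```

Because the map from $\delta$ to $\rho$ is increasing, an interval for $\delta$ carries over directly to an interval for $\rho$.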
REFERENCE 


[1] N. L. Johnson and B. L. Welch, "Applications of non-central t-distribution," Bio- 
metrika, Vol. 31 (1940), pp. 362-389. 







A NOTE ON KAC’S DERIVATION OF THE DISTRIBUTION OF THE 
MEAN DEVIATION 


By H. J. Godwin 
University College of Swansea, Wales 


In a paper on a general class of estimates of deviations, Kac [3] obtained an 
expression for the frequency function of the estimate of mean deviation from 
the mean in normal samples. He was unable to establish the identity of this 
with an expression obtained earlier by me [1]. I now shew that the two results 
are, in fact, equivalent. 

Kac uses the functions $\varphi^{(k)}(x)$, defined as the k-fold convolution of 


(0, se <¢; 


§(z) = 4 : 
| erin? », £26 


I used the functions G(x) defined by the recurrence relation 
(1) G(r) =1, Ge) = [ e™™ GD at 
0 


Now I have shewn elsewhere [2] that the integral of ¢*i+"""+*® taken through 
the interior of a regular simplex in k dimensions, with its centroid at the origin 
and of side a, is $\sqrt{k+1}\,G_k(a/\sqrt{2})$. The relation (1) corresponds to a dissection 
of the simplex into sections, which are (k — 1)-dimensional simplexes, by joining 
the centroid to the vertices and taking sections parallel to the base of each of the 
(k + 1) smaller simplexes so formed. If however we take sections parallel 
to a base of the whole simplex we get another recurrence relation, viz. 


(2) G, (x) = [ e7(he- RD OPRED CY (4) dt. 


Now (2) may be re-written 


—(n2zx2/2(k+1)) z —(n2t2/2k) 
ae [ go int(e—0)2/2) G,1(nt)e 
n* 0 nk-1 





dt 


: : —(n2z2 /2k) 
whence, by induction, G;-1(nz)-e°” wie 


of Kac’s result to mine is established. 


= n*'¢(z) and the equivalence 


REFERENCES 


[1] H. J. Godwin, "On the distribution of the estimate of mean deviation obtained from 
samples from a normal population," Biometrika, Vol. 33 (1945), pp. 254-256. 

[2] H. J. Godwin, "A further note on the mean deviation," Biometrika, Vol. 35 (in the 
press). 

[3] M. Kac, "On the characteristic functions of the distributions of estimates of various 
deviations in samples from a normal population," Annals of Math. Stat., Vol. 19 
(1948), pp. 257-261. 










CORRECTION TO “ASYMPTOTIC FORMULAS FOR SIGNIFICANCE 
LEVELS OF CERTAIN DISTRIBUTIONS” 


By A. M. PEISER 
New York City 


Professor Henry Scheffé has recently pointed out to me an error in my paper 
“Asymptotic formulas for significance levels of certain distributions,” which 
appeared in Annals of Math. Stat., Vol. 14 (1943), pp. 56-62. In the determina- 
tion of the significance levels of Student’s ¢ distribution, appeal was made to a 
theorem of Cramér which requires independent random variables. The variables 
defined at the top of page 61, however, cannot be taken as independent, so that 
the theorem does not apply. 

The asymptotic formula (following the notation of the paper) 

$$t_{p,n} = y_p + \frac{y_p^3 + y_p}{4n} + o\!\left(\frac{1}{n}\right),$$

where 

$$\frac{1}{\sqrt{2\pi}}\int_{y_p}^{\infty} e^{-x^2/2}\,dx = p,$$


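is nevertheless correct. This may be shown directly from the distribution 
function 

$$G_n(x) = \frac{1}{2} + \frac{\Gamma\left(\tfrac{1}{2}(n+1)\right)}{\sqrt{n\pi}\,\Gamma\left(\tfrac{1}{2}n\right)}\int_0^x \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt.$$

Writing 

$$\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} = \exp\left\{-\frac{n+1}{2}\log\left(1 + \frac{t^2}{n}\right)\right\} = \exp\left\{-\frac{n+1}{2}\left(\frac{t^2}{n} - \frac{t^4}{2n^2} + \frac{t^6}{3(n + \theta t^2)^3}\right)\right\}, \qquad |\theta| < 1,$$

and using Stirling's formula, it follows that $G_n(x)$ can be written in the form 

$$G_n(x) = \frac{1}{2} + \frac{1}{\sqrt{2\pi}}\int_0^x e^{-t^2/2}\left[1 + \frac{t^4 - 2t^2 - 1}{4n} + \frac{Q_n(t)}{n^2}\right] dt = \Phi(x) - \frac{x^3 + x}{4n\sqrt{2\pi}}\,e^{-x^2/2} + \frac{1}{\sqrt{2\pi}}\int_0^x \frac{Q_n(t)}{n^2}\,e^{-t^2/2}\,dt,$$

where $Q_n(t)$ is a bounded function of t and n in each finite interval. 

Let $t_{p,n} = y_p + a_n$, where $a_n = o(1)$. Then $G_n(t_{p,n}) = \Phi(y_p) = 1 - p$, and 
we have 

$$n a_n\,\frac{\Phi(y_p + a_n) - \Phi(y_p)}{a_n} = \frac{(y_p + a_n)^3 + (y_p + a_n)}{4\sqrt{2\pi}}\,e^{-\frac{1}{2}(y_p + a_n)^2} + O\!\left(\frac{1}{n}\right),$$

so that 

$$\lim_{n \to \infty} n a_n = \frac{y_p^3 + y_p}{4}.$$

This is the required result. 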
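
The quality of the approximation $y_p + (y_p^3 + y_p)/(4n)$ can be examined numerically. The sketch below is an illustration and not part of the correction: it evaluates Student's distribution function by Simpson's rule, solves for the significance level by bisection, and compares the result with the asymptotic value. The function names, the step count and the bracketing interval are assumptions chosen for convenience.

```python
import math
from statistics import NormalDist


def t_cdf(x, n, steps=2000):
    """Student's distribution function G_n(x) for x >= 0, by Simpson's rule."""
    const = math.exp(math.lgamma((n + 1) / 2) - math.lgamma(n / 2)) / math.sqrt(n * math.pi)

    def integrand(t):
        return (1.0 + t * t / n) ** (-(n + 1) / 2)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 0.5 + const * total * h / 3.0


def t_level_exact(p, n):
    """Solve G_n(t) = 1 - p for t by bisection (0 < p < 1/2)."""
    lo, hi = 0.0, 50.0
    for _ in range(80):
        mid = (lo + hi) / 2.0
        if t_cdf(mid, n) < 1.0 - p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def t_level_asymptotic(p, n):
    """y_p + (y_p^3 + y_p) / (4n), with y_p the upper-tail normal level."""
    y = NormalDist().inv_cdf(1.0 - p)
    return y + (y ** 3 + y) / (4.0 * n)


print(t_level_exact(0.05, 30), t_level_asymptotic(0.05, 30))
```

The difference between the two printed values should shrink faster than 1/n as n grows, in line with the $o(1/n)$ remainder.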








ABSTRACTS OF PAPERS 


(Abstracts of papers presented at the Seattle Meeting of the Institute on 
November 26-27, 1948) 


1. Estimation of the Variance of the Bivariate Normal Distribution. Harry 
M. Hughes, University of California, Berkeley. 


Let $x_1$ and $x_2$ be two random variables normally distributed with known means $m_1$ and $m_2$ 
and with common unknown variance $\sigma^2$. Consider an experiment in which the observed 
variable is $Y = \sqrt{(x_1 - m_1)^2 + (x_2 - m_2)^2}$. This paper considers the problem of estimating 
the parameter $\sigma$ when the observations are grouped. By the method of minimum reduced 
chi-square with linear restrictions, two best asymptotically normal estimates are derived. 
By minimization of the asymptotic variance of these estimates, the optimum choice of 
grouping is found as a function of $\sigma$. For two and for three groups, when it is known or 
assumed a priori that $\sigma$ is on a certain finite interval, the optimum grouping is derived 
which will minimize the maximum asymptotic variance on that interval. If such interval 
is moderately small, it is shown that the optimum grouping is the same as if $\sigma$ were known to 
have the value at the upper end of the interval. Finally the effect of using non-optimum 
grouping is analyzed. 


2. Derivation of a Broad Class of Consistent Estimates. R. C. Davis, Inyokern, 
California. 


Given a chance vector X with cumulative distribution function $F(X, \theta)$, where $\theta$ is an 
unknown parameter vector, a broad class of estimates of $\theta$ is derived having the following 
properties: a) any estimate in this class is a consistent estimate of $\theta$; b) any estimate is a 
symmetric function of independent observations of the chance vector X. The novel feature 
of this class is that no assumptions about the existence of various partial derivatives of a 
density function with respect to $\theta$ are made. As a matter of fact not even the existence of 
a density function is required, and it is postulated merely that a continuous function of X 
for each $\theta$ (in a certain neighborhood of the true $\theta$) and of $\theta$ for each X exist which satisfies 
a Lipschitz condition in $\theta$. For each such function having a finite first moment an estimate 
of $\theta$ is constructed which has the properties a) and b) listed above. 


3. Locally Best Unbiased Estimates. Edward W. Barankin, University of 
California, Berkeley. 


Let $p = \{p_\vartheta(x);\ \vartheta \in \Theta\}$ be a family of probability densities in the space $\Omega$ of points x; 
and g a function on $\Theta$. Let s be fixed and > 1; call an unbiased estimate of g best at $\vartheta_0$ if its 
s-th absolute central moment (s.a.c.m.) under $p_{\vartheta_0}$ is (finite and) not greater than the 
s.a.c.m., under $p_{\vartheta_0}$, of any unbiased estimate of g. With a certain integrability postulate 
on the $p_\vartheta$'s, a necessary and sufficient condition, of finite character, is established for the 
existence of an unbiased estimate of g having a finite s.a.c.m. under $p_{\vartheta_0}$. When such a one 
exists, there then exists a unique unbiased estimate which is best at $\vartheta_0$. The existence 
condition defines the s.a.c.m. of the best estimate explicitly as the l.u.b. of a set of numbers; 
in particular, we obtain immediately a generalization of the Cramér-Rao inequality. Also, 
when it exists, the best unbiased estimate is explicitly constructed from the function g 
and the densities $p_\vartheta$. The case s = 2 is studied more closely. Also, a detailed example 
is considered. 




4. Some Problems Related to the Distribution of a Random Number of Random 
Variables. Edward Paulson, University of Washington, Seattle. 


Let $\{x_i\}$ (i = 1, 2, 3, ...) be a set of independent random variables with identical distribu- 
tions, with $E(x) = a$ and $\sigma^2(x) = b$ ($0 < b < \infty$). Let N be a positive integral-valued random 
variable with distribution $F_\lambda(N)$ depending on a parameter $\lambda$, where $E(N) = A_\lambda$ and 
$\sigma^2(N) = B_\lambda$ ($0 < B_\lambda < \infty$). Now let $T_N = x_1 + x_2 + \cdots + x_N$. The limiting distribution 
as $\lambda \to \infty$ of $\dfrac{T_N - aA_\lambda}{\sqrt{a^2 B_\lambda + bA_\lambda}}$ has been investigated by Robbins (Proc. Nat. Acad. Science, 
Vol. 34 (April 1948), pp. 162-163) for several different sets of conditions on the distribu- 
tion of N. It can be shown that analogous results will hold if instead of $T_N$ we consider 
a more general statistic $T_N^*$, whose conditional distribution with respect to the variable N 
is such that there exist constants $a_1$ and $b_1$ so that 

$$\lim_{N \to \infty} E\left\{\exp\left[it\left(\frac{T_N^* - a_1 N}{\sqrt{b_1 N}}\right)\right]\right\} = h(t)$$

uniformly in any finite t interval, where h(t) is a characteristic function. Returning to the 
statistic $T_N$, it can be shown that there exists an asymptotic expansion in powers of $\lambda^{-1/2}$ 
with remainder $O(\lambda^{-(k/2)+1})$ for $P\left\{\dfrac{T_N - aA_\lambda}{\sqrt{a^2 B_\lambda + bA_\lambda}} < u\right\}$ when the following conditions are 
satisfied: (1) the distribution function of x has a non-zero absolutely continuous compo- 
nent, (2) $E(|x|^k) < \infty$, and (3) $\lambda \to \infty$ through integral values, and $F_\lambda(N)$ is the $\lambda$th con- 
volution of a random variable n such that $E(n^k) < \infty$. 


5. Asymptotic Expansions for the Distribution of Certain Likelihood Ratio 
Statistics. Albert H. Bowker, Stanford University. 


Asymptotic expansions of the "Cramerian" type are derived for the distribution of 
likelihood ratio statistics given by Wilks for testing various hypotheses about means, 
variances, and covariances of a normal multivariate distribution. The point of departure 
is Wilks' result that minus twice the logarithm of the likelihood ratio has the $\chi^2$ distribution; 
terms in $\frac{1}{N}, \frac{1}{N^2}, \dots$ may also be expressed in terms of the $\chi^2$ distribution. In addition, 
asymptotic expansions of the "Fisher-Cornish" type are obtained for the percentage points 
and for a transformation of the statistic to a $\chi^2$ variate. 


6. On a Problem of Confounding in Symmetrical Factorial Design. Esther 
Seiden, University of California, Berkeley. 


Let $m_3(r, s)$ be the maximum number of factors that it is possible to accommodate in a 
symmetrical factorial experiment in which each factor is at $s = p^n$ levels (p being any 
positive prime number, n a positive integer) and each block is of size $s^r$, without confound- 
ing any degrees of freedom belonging to any interaction involving 3 or fewer 
factors. 

R. C. Bose proved in a paper "Mathematical theory of factorial design," Sankhyā, Vol. 8 
(1947), pp. 107-166, that the following inequality holds: 

$s^2 + 1 \le m_3(4, s) \le s^2 + s + 2$. This gives in the case s = 4, $17 \le m_3(4, 4) \le 22$. It is now 
proved that $m_3(4, 4) = 17$. 

The proof consists in showing that the maximum number of points, no three of which are 
collinear, which can be chosen in a finite projective space PG(3, 4) cannot exceed 17, which according 
to a proof of R. C. Bose is equivalent to the statement that $m_3(4, 4)$ cannot exceed 17. 








7. Some Bounded Significance Level Tests of Whether the Largest Observa- 
tions of a Set are Too Small (Preliminary Report). John E. Walsh, Santa 
Monica, California. 

A set of n observations are given which satisfy: (1) They are independent and from 
continuous symmetrical populations; (2) The r largest observations are from populations 
with median $\theta$ while the remaining observations are from populations with median $\varphi$. It 
is required to test whether $\theta < \varphi$. Let x(1), ..., x(n) denote the observations arranged 
in increasing order of magnitude. For r = 1 tests of the form: Accept $\theta < \varphi$ if $x(n) < 
2x(w_\alpha) - x(i)$, where $\alpha = \Pr[x(n) + x(i) < 2\theta \mid \theta = \varphi]$ and $w_\alpha$ is the smallest integer 
satisfying $\Pr[x(w_\alpha) > \theta \mid \theta = \varphi] \le \alpha$, can be obtained for n > 15. Exact significance levels 
can be obtained by assuming a sample from a specified population (e.g. normal). On the 
basis of (1)-(2) alone, the significance level never exceeds $2\alpha$. For large n, tests can be ob- 
tained for any r if the observations satisfy the additional weak condition: (3) The tail order 
statistics are approximately independent of the central order statistics; also the variances 
of the tail order statistics are either very large or very small compared with the vari- 
ances of the central order statistics. The test is: Accept $\theta < \varphi$ if $\max[x(i_k) + x(n - j_k);\ 
1 \le k \le s \le r] < 2x(w_\alpha)$, where $i_u < i_{u+1}$, $j_v < j_{v+1}$, $j_s = r - 1$, $w_\alpha$ is the smallest integer sat- 
isfying $\Pr[x(w_\alpha) > \theta \mid \theta = \varphi] \le \alpha$, and $\alpha = \Pr\{\max[x(i_k) + x(n - j_k)] < 2\theta \mid \theta = \varphi\}$. For 
large n the significance level is approximately $\alpha$ but is $\le 2\alpha$ for all n. The power function 
$\to 1$ as $\varphi - \theta \to \infty$ and $\to 0$ as $\varphi - \theta \to -\infty$. 


8. Determination of Optimal Test Length to Maximize the Multiple Correla- 
tion. Paul Horst, University of Washington, Seattle. 


If the lengths of the tests in a battery are altered, their intercorrelations and their validi- 
ties or correlations with a criterion are also altered. Consequently, the multiple correlation 
of the battery with the criterion will also be altered. These changes are a function of the 
reliabilities of the tests. Suppose we have given from a set of experimental data (1) the 
time allowed for each test in the battery, (2) the reliability of each test, (3) the intercorrela- 
tions, and (4) the validities of all the tests. If we specify the overall testing time we are 
willing to allow for the test in the future, we can determine the amount by which each test 
must be altered in order to give the maximum multiple correlation with the criterion. The 
method, together with numerical examples and the mathematical proof, is presented. 


9. Some Numerical Comparisons of a Non-Parametric Test with other Tests. 
F. J. Massey, University of Oregon, Eugene. 
Let F(x) be the cumulative distribution function of a random variable X, and let $x_1 < x_2 < \cdots < x_n$ 
be the results of n independent observations ordered as to size. 
Define 

$$S_n(x) = \begin{cases} 0 & \text{if } x < x_1,\\ k/n & \text{if } x_k \le x < x_{k+1},\\ 1 & \text{if } x_n \le x.\end{cases}$$

To test the hypothesis $H_0: F(x) = F_0(x)$, where $F_0(x)$ is completely specified, use the 
criterion: reject $H_0$ if $\max |S_n(x) - F_0(x)| > \lambda/\sqrt{n}$. Choice of $\lambda$ determines the first kind 
of error. The second kind of error against specified alternatives can be calculated 
numerically. 
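
The criterion can be computed directly from an ordered sample. The following is a minimal sketch with hypothetical function names; since $S_n$ is a step function, the maximum deviation is attained at a sample point, either just before or just after the jump.

```python
import math


def max_deviation(sample, cdf0):
    """max |S_n(x) - F_0(x)| over all x, for a fully specified F_0."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for k, x in enumerate(xs, start=1):
        f = cdf0(x)
        # S_n jumps from (k - 1)/n to k/n at the k-th ordered observation.
        d = max(d, k / n - f, f - (k - 1) / n)
    return d


def reject(sample, cdf0, lam):
    """Reject H_0 when the maximum deviation exceeds lam / sqrt(n)."""
    return max_deviation(sample, cdf0) > lam / math.sqrt(len(sample))


# Illustrative data tested against the uniform distribution on (0, 1).
print(max_deviation([0.4, 0.1, 0.7], lambda x: x))
```

The value of `lam` fixes the first kind of error; it is left as a parameter here because the abstract does not give specific values.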




10. On the Deviation of Extreme Values. W. J. Dixon, University of Oregon, 
Eugene. 

Let x(i) be the ith observation in order of magnitude in a sample of size n. The distri- 
bution of $R = \dfrac{x(n) - x(i)}{x(n) - x(1)}$ is obtained explicitly for samples from a rectangular distribution 
and for n = 3, 4, 5, for samples from a normal distribution. Percentage values of R for 
values of n up to 30 are presented. Generalizations of R are indicated. 


11. The Optimum Size of Interval for Making Measurements of a Rocket's 
Angular Velocity. Edward A. Fay, University of California, Berkeley. 


Over a given range of time $0 < \tau < T$, the angular velocity of a rocket's spin is adequately 
represented by a polynomial $\xi(\tau)$ of given degree s - 1 but with unknown coefficients. 
The rocket's angular acceleration and the angle through which it spins in a given time- 
interval may then be obtained respectively by differentiating and integrating $\xi(\tau)$. Let 
$\nu$ be an integer > s, let $t = T/\nu$, and let $\eta_i$ be the angle through which the rocket turns in the 
interval $(i - 1)t < \tau < it$. While $\xi(\tau)$ and $\xi'(\tau)$ cannot be directly observed, the angles 
$\eta_1, \eta_2, \dots, \eta_\nu$ can. Let $Y_i$ be an observation on $\eta_i$, and assume that $Y_1, Y_2, \dots, Y_\nu$ 
are independent homoscedastic variables. The $Y_i$ may then be combined by the method 
of least squares to obtain best linear estimates $X(\tau, t)$ and $X'(\tau, t)$ of $\xi(\tau)$ and $\xi'(\tau)$. The 
choice of t is at the observer's disposal. For the cases s = 2, 3, 4, and for the cases that the 
common variance of the $Y_i$ is (a) independent of t or (b) proportional to t, an expression is 
derived for the variance of $X'(\tau, t)$, and the maximum value of that variance over the 
range of $\tau$ is minimized with respect to t. The method is of much more general application. 


12. Stationary Time Series Analysis and Common Stock Price Forecasting. 
Zenon Szatrowski, University of Oregon, Eugene. 


The objective of this paper is to present a statistical procedure of practical value in the 
problem of extracting information from the past behaviour of economic time series, informa- 
tion to be used in projecting future patterns. The author feels that his approach yields 
results closer to reality than the techniques described by Herman Wold, M. G. Kendall, 
H. T. Davis, and in particular, the technique of ‘‘ disturbed harmonics” used by G. U. Yule. 
The idea of the proposed technique can be described by examining the autoregression 
scheme, which seems to be considered the most desirable by the above men. A simple 
example of such a scheme is the equation 


$$u_{t+2} = -a\,u_{t+1} - b\,u_t + \epsilon_{t+2},$$


where the u's are the time series values and $\epsilon$'s are random elements. The above linear 
relationship, when determined either directly or through an empirical correlogram (for 
which data is usually inadequate) is a kind of an average relationship. It may be as inap- 
propriate in estimating future values of a time series as would be an average in estimating 
the level of a series with a pronounced trend. 

The author proposes using derived time series to shed light on the nature of the changes 
in the parameters under consideration. Such derived series could be estimates of the a's 
and b's for successive time periods. The author has found that projections of common 
stock price fluctuations were improved considerably when the changing nature of the 
cyclical pattern was taken into account. This was done by constructing derived time 
series, ‘‘moving”’ estimates of the amplitude, period and phase of the dominant harmonics. 










The author points out that the above approach has shown promise in commodity prices 
as well as common stocks. The value of this approach in forecasting lies in the facts that 
(1) it does not require forecasts of other series and (2) it is based on the realistic assumption 
that history repeats itself but with variations, variations which may be taken into account 
through appropriate models. 
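
The idea of a derived series of coefficient estimates can be sketched as follows. This is an illustration under stated assumptions, not the author's procedure: the coefficients of a second-order autoregression are fitted by ordinary least squares within each window of a fixed length, and the function names and the window length are hypothetical.

```python
def fit_ar2(u):
    """Least-squares a, b in u[t+2] = -a*u[t+1] - b*u[t] + error."""
    s11 = s12 = s22 = s1y = s2y = 0.0
    for t in range(len(u) - 2):
        # Regress y on x1 and x2 so that the fitted slopes are a and b.
        x1, x2, y = -u[t + 1], -u[t], u[t + 2]
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        s1y += x1 * y
        s2y += x2 * y
    det = s11 * s22 - s12 * s12
    a = (s1y * s22 - s2y * s12) / det
    b = (s2y * s11 - s1y * s12) / det
    return a, b


def moving_estimates(u, window):
    """Derived series: (a, b) fitted on each successive window of the data."""
    return [fit_ar2(u[i:i + window]) for i in range(len(u) - window + 1)]


# Noise-free illustration with a = -1.2, b = 0.8.
series = [1.0, 0.5]
for _ in range(38):
    series.append(1.2 * series[-1] - 0.8 * series[-2])
print(moving_estimates(series, 10)[0])
```

With real data the successive estimates would vary from window to window, and it is that variation which the derived series is meant to expose.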


13. Distribution of the Number of Schools of Fish Caught Per Boat. J. Neyman, 
University of California, Berkeley. 


Let $\lambda$ be the average number of schools of fish per unit area of a fishing ground A. Let a 
be any area partial to A, and let $Q(m, a, \lambda)$ denote the probability that exactly m schools of 
fish will be found within a. At time t = 0 a boat begins scouting for fish in A traveling at 
constant speed v. It is assumed that all schools of fish within distance r of the boat are 
detected and none is detected at a greater distance. If $s \ge 1$ schools are detected then they 
are caught in turn, the catching of one school taking up exactly h hours. X(t) denotes the 
random variable representing, for each $t \ge 0$, the number of schools caught up to time t 
including the one which may be in the process of being caught at the moment t. Probability 
distribution of X(t) is given by the formula 

$$P\{X(t) \le k\} = \sum_{m=0}^{k} Q[m,\ 2rv(t - kh),\ \lambda]$$

for k = 0, 1, 2, ..., n - 1, where n - 1 is the greatest integer smaller than t/h. Of course 
$P\{X(t) \le n\} = 1$. This result is easily generalized for the joint distribution of catches of 
several boats fishing in the same area so that their paths do not cross. Assuming specific 
functions to represent $Q(m, a, \lambda)$, formulae may be obtained to estimate the parameters $\lambda$ 
and rv. 
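
The formula can be evaluated once a specific form is chosen for Q. The sketch below assumes, purely for illustration, that the number of schools in an area follows a Poisson law with mean equal to $\lambda$ times the area; the abstract does not commit to this choice, and the function names are hypothetical.

```python
import math


def poisson_q(m, area, lam):
    """Assumed form of Q(m, a, lambda): Poisson with mean lam * area."""
    mean = lam * area
    return math.exp(-mean) * mean ** m / math.factorial(m)


def prob_catch_at_most(k, t, h, r, v, lam):
    """P{X(t) <= k}: at most k schools met in the scouting time t - k*h."""
    if k * h >= t:
        # k is at least n, so the probability is 1.
        return 1.0
    swept_area = 2.0 * r * v * (t - k * h)
    return sum(poisson_q(m, swept_area, lam) for m in range(k + 1))


# Illustrative values: 5 hours at sea, 1 hour per catch.
for k in range(6):
    print(k, prob_catch_at_most(k, t=5.0, h=1.0, r=0.5, v=2.0, lam=0.1))
```

Each additional catch removes h hours from the time available for scouting, which is why the swept area shrinks as k increases.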


14. Some Problems in Fishery Research to which Statistical Methods are 
Applicable (Preliminary Report). Ralph P. Silliman, U. S. Fish and 
Wildlife Service, Seattle. 


One of the most difficult problems is the obtaining of a random sample of a fish popula- 
tion. Rarely are such populations randomly distributed over any area, and the samples 
must often be taken from the catches of fishing vessels which do not uniformly cover even a 
part of the area of distribution of the population. Many distributions of variables found in 
fishery research are not normal, and statistical methods based on the normal distribution 
can be applied only through the use of unsatisfactory transformations. Since fishery 
research is largely observational in technique, data reflecting the concurrent effect of 
several variables are usually obtained. Although the present methods of multiple correla- 
tion and regression can be used in some instances to measure the relative effect of the 
separate variables, there are many situations in which these methods do not yield good 
results. Finally, many data used in fishery research must be adjusted before use, and 
existing methods do not give good measures of the expected variability of such adjusted 
data. Examples of specific problems are found in the distribution of deliveries and the 
variations in catch of Columbia River chinook salmon. 


15. The Application of the Hypergeometric Distribution to Problems of Esti- 
mating and Comparing Zoological Population Sizes. Douglas Chapman, 
University of California, Berkeley. 

Estimates and tests of the $\chi^2$ type, as developed by Neyman, are adapted to sampling 
without replacement from a finite population. These results are applied to problems of 
estimation and comparison of zoological population sizes as determined by sampling pro- 
cedures. For single samples the bias and variance of different estimators is compared. 
Finally some numerical calculations are made for various population and sample sizes to 
determine how different sample sizes and different methods of analysis affect the size of the 
critical region which is necessarily an approximation to the desired size. For some of 
these the power of the test is considered. 


16. Extension to Multivariate Case of Neyman's Smooth Test with Astronomical 
Application. Elizabeth L. Scott, University of California, Berkeley. 


It is more or less generally accepted that the distribution of extra-galactic nebulae in 
space is not uniform in the small. In particular, counts in small cubes show distinct signs 
of contagion. On the other hand, it is not settled whether or not lack of uniformity in the 
large exists. One way of making this statement precise is to assert that the power series 
expansion of the logarithm of the probability density of the two angular coordinates of the 
nebulae within a given large area on the unit sphere does not contain low order terms. 
In fact, any such low order terms could be interpreted as determining ‘‘trends”’ or what 
could be described as lack of uniformity in the large. From this point of view, uniformity 
in the large may be tested by a two dimensional Neyman Smooth Test for goodness of fit. 

Let $\{\pi_{ij}(x, y)\}$ be a sequence of polynomials in x and y ortho-normal for $|x| < a$ and 
$|y| < b$. If $x_k$ and $y_k$ are the coordinates of the kth out of n nebulae counted within the 
rectangle (-a, a), (-b, b) then the smooth test of mth order consists of rejecting the hypoth- 
esis of uniformity in the large when 

$$\sum_{i+j=1}^{m}\left(\sum_{k=1}^{n}\pi_{ij}(x_k, y_k)\right)^2 \ge n\chi^2,$$

where $\chi^2$ is the tabled value of $\chi^2$ with m(m + 3)/2 degrees of freedom. 
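
The test statistic can be sketched with one convenient choice of polynomials. The code below assumes products of Legendre polynomials, scaled so as to be orthonormal with respect to the uniform density on the rectangle; the abstract does not say which system was used, and the function names are hypothetical. The statistic is returned already divided by n, so that it is compared directly with the tabled value.

```python
import math


def legendre(i, u):
    """Legendre polynomial P_i(u) by the three-term recurrence."""
    if i == 0:
        return 1.0
    p_prev, p = 1.0, u
    for k in range(1, i):
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p


def pi_ij(i, j, x, y, a, b):
    """Assumed orthonormal system on |x| < a, |y| < b (uniform weight)."""
    scale = math.sqrt((2 * i + 1) * (2 * j + 1))
    return scale * legendre(i, x / a) * legendre(j, y / b)


def smooth_statistic(points, a, b, m):
    """Return (statistic / n, number of terms) for the m-th order test."""
    n = len(points)
    total, terms = 0.0, 0
    for i in range(m + 1):
        for j in range(m + 1 - i):
            if i + j == 0:
                continue
            s = sum(pi_ij(i, j, x, y, a, b) for x, y in points)
            total += s * s
            terms += 1
    return total / n, terms


pts = [(0.5, 0.5), (-0.5, 0.5), (0.5, -0.5), (-0.5, -0.5)]
print(smooth_statistic(pts, 1.0, 1.0, 2))
```

The number of terms returned equals m(m + 3)/2, which matches the degrees of freedom quoted for the test.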




17. A Mathematical Theory of Vitamin A Metabolism in Fish (Preliminary 
Report). Norman E. Cooke, Vancouver, B.C. 


Several possible hypotheses for vitamin A metabolism in fish are developed from simple 
postulates. These hypotheses are tested (by least squares method) against experimental 
data in an attempt to deduce the correct mechanism. 


18. The Interactance Hypothesis between Populations. Stuart C. Dodd, 
University of Washington, Seattle. 


The hypothesis of interacting between human populations, or of demographic gravita- 
tion, is that the number of interactions between two communities (or other groups) tends 
to vary directly with the product of the two populations and their ‘‘specific coefficients” 
and the overall duration and tends to vary inversely with the intervening distance and the 
average duration of an interact. The hypothesis is tested by isolating factors and measur- 
ing their correlation with the amount of interacting in the pairs of a set of N communities. 

This hypothesis is supported by studies of telephoning; news circulating; travel by bus, 
train, or plane; R. R. express; college attendance; intermarrying; etc. Further lists of 
interhuman actions are suggested for investigation. 

A new corroborating bit of data comes from a poll by the Washington Public Opinion 
Laboratory in a Seattle housing project where negro-white relations threatened violence. 
The tension units of verbal interaction (defined as one anti-negro opinion asserted by one 
white person) were observed to decrease inversely with a power of the distance from a rape 
site. The observed tension correlated with the formulas or curves predicting that tension 
at p = .94 and passed the chi-square test at the one per cent level. The tension is dimen- 
sionally analyzed as a social force and social energy. 










19. The Employment of Marked Members in the Estimation of Animal Popula- 
tions. Milner B. Schaefer, U. S. Fish and Wildlife Service, Honolulu, 
T. H. 


The estimation of population numbers by marked members is an important technique in 
fisheries research. The number N of individuals in the population, of which T are known 
to be marked, may be estimated from a sample of n of which t are found to be marked. 
Several estimates are available, all of which reduce to $N = \dfrac{nT}{t}$ when the numbers are all 
large, but more precise formulae should be used when the numbers are not all large. An 
estimate of the variance of N has been derived by Karl Pearson (Biometrika, Vol. 20 (1928), 
pp. 149-174) on the basis of inverse probabilities. The sampling error may also be measured 
by means of confidence intervals. Formulae have been developed for estimating N from 
repeated samples of the same population, but no very suitable estimates of the sampling 
error are available in this case. For some migratory fishes marked at a point on their 
migration path and sampled later at another point, there exists a correlation between time 
of marking and time of recovery in the subsequent samples. In such case, the total number 
of fish marked or drawn in the subsequent samples cannot in general be regarded as random 
samples of the population. Where numbered tags are employed as marks, so the fish may 
be individually identified both when marked and recovered, a method of estimating N in 
this case also is suggested. 
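
The large-number form of the estimate is a simple proportion. The following is a minimal sketch with a hypothetical function name and invented counts; it implements only the limiting form $nT/t$ and none of the more precise formulae mentioned in the abstract.

```python
def marked_member_estimate(n, marked_total, marked_in_sample):
    """Large-number estimate of N: sample size times T, divided by t."""
    if marked_in_sample <= 0:
        raise ValueError("at least one marked member must be recovered")
    return n * marked_total / marked_in_sample


# Illustrative counts: 150 marked, 200 sampled, 30 of the sample marked.
print(marked_member_estimate(200, 150, 30))
```

The estimate rests on the sample being random with respect to marking, which is the assumption the abstract questions for migratory fishes.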


20. Non-Response and Repeated Call-Backs in Sampling Surveys. Z. W. 
BIRNBAUM AND MONROE G. SIRKEN, University of Washington, Seattle. 


In opinion-polls and other sampling surveys, a response can only be obtained from those 
individuals of a sample who are available for interviewing. Let $p_{\cdot 1}$ be the probability that
an individual chosen at random from the population answers "yes" to a question, $p_{1\cdot}$ that
an individual is available for interviewing, and $p_{11}$ that an individual is available and
answers "yes." Usually one wishes to estimate the parameter $p_{\cdot 1}$, but from a sample it
is only possible to estimate $p_{11}/p_{1\cdot} = p'$ = the probability that an individual answers "yes"
if he is available. Thus the total error in estimating $p_{\cdot 1}$ from a sample contains two
components: the bias $p_{\cdot 1} - p'$ and the sampling error. In this paper a technique is presented
in which individuals not available at a call are called upon repeatedly, up to k times. It 
is shown how, for a given upper bound of the total error at a prescribed probability level 
and a given k, it is possible to minimize the cost of the survey by optimizing the relationship 
between the greatest possible bias and the sampling error. 
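
The bias component described in this abstract can be illustrated numerically. The sketch below assumes a hypothetical 2 x 2 table of joint probabilities (available or not, answering "yes" or "no"); the function name and the figures are invented for illustration, and the call-back scheme and cost minimization of the paper are not reproduced.

```python
from math import isclose


def response_bias(p_avail_yes, p_avail_no, p_unavail_yes, p_unavail_no):
    """Return (P(yes), P(yes | available), bias) from four joint probabilities."""
    total = p_avail_yes + p_avail_no + p_unavail_yes + p_unavail_no
    if not isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("the four joint probabilities must sum to 1")
    p_yes = p_avail_yes + p_unavail_yes      # probability of a "yes" in the whole population
    p_available = p_avail_yes + p_avail_no   # probability of being available
    p_prime = p_avail_yes / p_available      # what a single-call sample actually estimates
    return p_yes, p_prime, p_yes - p_prime


# Hypothetical population in which unavailable people say "yes" more often.
p_yes, p_prime, bias = response_bias(0.30, 0.30, 0.25, 0.15)
print(round(p_yes, 4), round(p_prime, 4), round(bias, 4))  # 0.55 0.5 0.05
```

The bias vanishes only when availability and the answer are independent, which is why repeated calls on unavailable individuals can reduce it.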




(Abstracts of papers presented at the Cleveland Meeting of the Institute on December 
27-30, 1948.) 


21. A Necessary Condition for a Certain Class of Characteristic Functions 
(Preliminary report). EUGENE LUKACS, NOTS, Inyokern, California and
Our Lady of Cincinnati College, Cincinnati, Ohio. 


Let $\varphi(t) = \left[\left(1 - \frac{t}{v_1}\right)\left(1 - \frac{t}{v_2}\right)\cdots\left(1 - \frac{t}{v_n}\right)\right]^{-1}$ be the reciprocal of a polynomial without
multiple roots. The following necessary condition is derived which $\varphi(t)$ has to satisfy in
order to be the Fourier transform (characteristic function) of a distribution.




If $\varphi(t)$ is the Fourier transform of a distribution, then

1) $\varphi(t)$ has no real roots. If $b + ia$ ($a \neq 0$, $b \neq 0$) is a root then $-b + ia$ is also a root.
That is, the roots of $\varphi(t)$ are either located on the imaginary axis or are symmetrical to this
axis.

2) If $b + ia$ ($a \neq 0$) is a root then there exists also at least one root $i\alpha$ so that sign $\alpha$ =
sign $a$ and $|\alpha| \le |a|$.


As a particular case one obtains the well known fact that $(1 + t^4)^{-1}$ cannot be a
characteristic function.
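
The two root conditions can be checked mechanically for a given list of roots. The following is a sketch under stated assumptions: the roots are supplied directly as complex numbers b + ia, the function name and tolerance are hypothetical, and passing the check shows only that the necessary condition is not violated.

```python
import cmath


def passes_necessary_condition(roots, tol=1e-9):
    """Check conditions 1) and 2) for a list of distinct complex roots b + ia."""
    def is_root(z):
        return any(abs(z - r) <= tol for r in roots)

    for r in roots:
        b, a = r.real, r.imag
        if abs(a) <= tol:                      # condition 1): no real roots
            return False
        if abs(b) > tol:
            if not is_root(complex(-b, a)):    # condition 1): mirror image -b + ia
                return False
            has_imaginary_partner = any(
                abs(s.real) <= tol and s.imag * a > 0 and abs(s.imag) <= abs(a) + tol
                for s in roots
            )
            if not has_imaginary_partner:      # condition 2)
                return False
    return True


# Roots of 1 + t^4: the four primitive eighth roots of unity.
quartic_roots = [cmath.exp(1j * cmath.pi * (2 * k + 1) / 4) for k in range(4)]
# Roots of 1 + t^2, whose reciprocal is the Laplace characteristic function.
quadratic_roots = [1j, -1j]

print(passes_necessary_condition(quartic_roots))    # False
print(passes_necessary_condition(quadratic_roots))  # True
```

The quartic fails because none of its roots lies on the imaginary axis, which is the particular case cited in the abstract.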


22. Precision of Estimates from Samples Selected under Marginal Restrictions. 
(Preliminary Report). CLIFFORD J. MALONEY, Camp Detrick, Frederick,
Maryland. 


Formulas are derived for estimates and for their variances computed from samples drawn 
at random subject only to marginal restrictions from populations classified by several 
characters, and estimates are made of the efficiency of such sampling plans compared to 
sampling with complete stratification or sampling completely at random. By means of two 
simple but general theorems it is shown that the variances are independent of the individual 
values of the character being sampled for in the population and in the sample and depend 
only on the first two moments for each cell of the population. It is shown that in the large 
sample approximation a practical scheme for actually drawing such samples can be obtained 
by drawing a sample of size n entirely at random and using the results of Deming and 
Stephan (Annals of Math. Stat., Vol. 11 (1940), p. 427) to adjust the sample marginal totals 
to the specified values. Deficient cells will of course be filled up by additional drawings. 
A measure is given of the relative loss of information in sampling with marginal restrictions 
on the sample cell numbers compared to sampling with complete stratification. If $a_{ij}$
represents the population mean in the $ij$th cell, $r_i$ the population mean in the $i$th row and
$c_j$ the population mean in the $j$th column, and if $a_{ij}$ is of the form $a_{ij} = a + r_i + c_j$, then
marginally restricted sampling is as efficient as sampling with complete stratification. For
arbitrary $a_{ij}$ a measure of the relative efficiency compared to sampling completely at random
is given by the relative degrees of freedom for the sample cell numbers. A comparison
with other possible sampling procedures is given.
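
The Deming and Stephan adjustment cited in this abstract is commonly carried out by alternately rescaling rows and columns until the table matches the specified marginal totals. The sketch below shows that iteration only. The function name, the stopping rule, and the sample table are assumptions for illustration, all cells are assumed positive, and the filling of deficient cells by additional drawings is not modeled.

```python
def adjust_to_margins(table, row_targets, col_targets, max_iter=200, tol=1e-10):
    """Rescale a positive two-way table until its margins match the targets."""
    fitted = [[float(v) for v in row] for row in table]
    for _ in range(max_iter):
        for i, target in enumerate(row_targets):      # scale each row to its target
            row_sum = sum(fitted[i])
            fitted[i] = [v * target / row_sum for v in fitted[i]]
        for j, target in enumerate(col_targets):      # then scale each column
            col_sum = sum(row[j] for row in fitted)
            for row in fitted:
                row[j] *= target / col_sum
        row_error = max(abs(sum(row) - t) for row, t in zip(fitted, row_targets))
        if row_error < tol:                           # columns are exact after the last step
            break
    return fitted


# Hypothetical sample cell counts and specified marginal totals (both sum to 100).
adjusted = adjust_to_margins([[30, 20], [10, 40]], row_targets=[40, 60], col_targets=[45, 55])
for row in adjusted:
    print([round(v, 3) for v in row])
```

The row and column targets must have the same grand total, otherwise the iteration cannot satisfy both sets of margins.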


23. Properties of Maximum- and Quasi-Maximum Likelihood Estimates of 
Parameters of a System of Linear Stochastic Difference Equations with 
Serially Correlated Disturbances (Preliminary Report). HERMAN RUBIN,
Cowles Commission, The University of Chicago. 


Let $A_{ux} x_t = u_t$ be a complete system of linear stochastic difference equations,
$x_t = (y_t, z_t)$, $y_t$ jointly dependent, $z_t$ predetermined. Let us suppose $u_t + B_{uu} u_{t-1} = v_t$, where
the random vectors $v_t$ are serially independent and have mean zero. If the vectors $v_t$
have the same Gaussian distribution, and the system is identified, we can obtain
maximum-likelihood estimates; if the distributions are not identical Gaussian,
quasi-maximum-likelihood estimates result. The identification problem is a special case of that with independent
$u_t$ and bilinear restrictions on some $A^{*}_{ux}$, if the restrictions on $A_{ux}$ are linear or bilinear.
As in that case, we may have multiple identification. However, the special aspects of this 
type of system yield some help in the discussion of the identification problem. We also 
observe that if the system is identified, we obtain consistency and asymptotic normality 
of the estimates under the same conditions as with serially independent $u$'s for $A_{ux}$.


24. The Computation of Maximum Likelihood Estimates of Parameters of a
System of Linear Stochastic Difference Equations with Serially Correlated
Disturbances. HERMAN CHERNOFF, Cowles Commission, The University
of Chicago.


Consider the structural equations $A_{ux} x_t = u_t$ where the vector $x_t = [y_t, z_t]$, $y_t$ are the
jointly dependent, and $z_t$ the predetermined variables and where $u_t$ are serially correlated.
In particular assume that the disturbances $u_t$ satisfy the simple Markoff Process
$u_t + B_{uu} u_{t-1} = v_t$ where $v_t$ is a stationary serially uncorrelated Gaussian Process with zero
mean. Then we have $A_{ux} x_t + B_{uu} A_{ux} x_{t-1} = v_t$. The estimates of $B_{uu}$ and $E\{v_t v_t'\}$ can be
simply expressed in terms of those of $A_{ux}$. It is shown that iterative gradient methods of
maximization require about 2 to 3 times as much work per iteration as in the serially
uncorrelated case. To apply the Newton Method about 8 times as much work per iteration
is required. The Newton Method uses the second order terms of the expansion of the log
of the likelihood in terms of the independent parameters of $A_{ux}$, and these can be used to
obtain estimates of the asymptotic covariance of the estimates. 


25. Test Criteria for Hypotheses of Symmetry and Definiteness of a Regression 
Matrix for Demand Functions. UTTAM CHAND, University of North
Carolina. 


The importance of relations between two sets of variates (e.g. the study of relations of 
the prices to the quantities of several commodities) invariant under linear transformations 
of one set of variates contragredient to those of the other was first pointed out by Hotelling. 
In the study of related demand functions no suitable statistical tests have existed for 
testing the hypotheses of symmetry and negative definiteness of the regression matrix of 
prices on quantities. The test proposed here for the hypothesis of symmetry is exact and 
invariant under all contragredient transformations. A separate test studied for both 
symmetry and negative definiteness satisfies the property of invariance but its distribution 
depends on a nuisance parameter which is the non-zero root of a certain determinantal 
equation. The likelihood ratio criterion under the hypothesis of symmetry leads to a
multilateral matric equation which represents $\tfrac{1}{2}p(p+1)$ equations of the third degree in $\tfrac{1}{2}p(p+1)$
unknown regression coefficients for the p-variate case, and does not admit of a unique 
solution. 


26. The Distribution of Extreme Values in Samples whose Members are
Stochastically Dependent. BENJAMIN EPSTEIN, Wayne University, Detroit.


In this paper the following problem is considered: to find the distribution of the largest
and smallest values in samples of size n drawn from a random process subject to the
following conditions:

(i) observations $x_1, x_2, \ldots, x_n$ are taken in order from some random process.

(ii) the random process is such that successive observations $x_i$ and $x_{i+1}$ are jointly
dependent. The joint distribution is described analytically, independently of $i$,
by a two-dimensional d.f.


$$F(x, y) = \mathrm{Prob}(x_i < x,\ x_{i+1} < y), \qquad 1 \le i \le n - 1,$$


(iii) $F(x, y) = F(y, x)$
(iv) Any other pairs of observations $(x_i, x_{i+j})$, $1 \le i \le n - 1$, $2 \le j \le n - 1$, are
assumed to be independent.
The results in this paper generalize the special situation where all observations are inde- 
pendent. More general cases than those covered by (i)-(iv) are briefly considered. 




27. On Age-Dependent Stochastic Branching Processes. RICHARD BELLMAN
AND THEODORE E. HARRIS, Stanford University and The RAND Corporation,
Palo Alto and Santa Monica, California.


An initial particle has a random life length T with c.d.f. G(t). At death it is replaced
by a random number N of similar particles; $P(N = n) = q_n$. Particles produced have the
same distributions of life-length and replacement as the original one.


Let z(t) = number of particles at time t, $h(s) = \sum_{n=0}^{\infty} q_n s^n$, $F(s, t) = \sum_{n=0}^{\infty} P(z(t) = n) s^n$. The
integral equation $F(s, t) = \int_0^t h[F(s, t - y)]\, dG(y) + s[1 - G(t)]$ uniquely determines
F(s, t). When suitable restrictions are put on h(s) and G(t), results of Feller can be applied
to study the asymptotic behavior of the moments of z(t), which satisfy linear integral
equations of the convolution type, and further special results on the moments can be
obtained. The condition $\sum n q_n > 1$ and certain further restrictions insure that $z(t)/e^{bt}$
converges in probability as $t \to \infty$, where b is a certain constant. The m.g.f. $\varphi(s)$ of the
limiting distribution satisfies the equation $\varphi(s) = \int_0^{\infty} h[\varphi(s e^{-by})]\, dG(y)$. Further
restrictions imply that $\varphi(s)$ is analytic in a neighborhood of s = 0, and that the corresponding
distribution is absolutely continuous.
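
The process described in this abstract can be simulated directly for one special case. The sketch below assumes exponentially distributed life lengths and exactly two replacement particles at each death, a choice made here only because the mean number of particles at time t is then known to be e^{rate * t}. The function names, seed, and run count are hypothetical, and the general G(t) and offspring law of the abstract are not covered.

```python
import math
import random


def particles_alive(t_end, rate, offspring, rng):
    """Simulate one family tree and count the particles alive at time t_end."""
    alive = 0
    pending_births = [0.0]                 # birth times of particles still to be followed
    while pending_births:
        born = pending_births.pop()
        death = born + rng.expovariate(rate)
        if death > t_end:
            alive += 1                     # still living at t_end
        else:
            pending_births.extend([death] * offspring(rng))
    return alive


def mean_alive(runs, t_end, rate, offspring, seed):
    """Average the count over independent runs driven by one seeded generator."""
    rng = random.Random(seed)
    return sum(particles_alive(t_end, rate, offspring, rng) for _ in range(runs)) / runs


def always_two(rng):
    return 2                               # every death produces exactly two particles


estimate = mean_alive(runs=2000, t_end=2.0, rate=1.0, offspring=always_two, seed=1)
print(estimate, math.exp(2.0))             # the two values should be close
```

Replacing `always_two` with a function that draws from another offspring law gives other cases, although the simple exponential-growth check above then no longer applies as written.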


28. Cuboidal Lattices. G. S. WATSON, Institute of Statistics, North Carolina.


Yates has given two series of partially balanced incomplete block designs, square and 
cubic lattices, which enable the experimenter to test respectively $k^2$ and $k^3$ varieties in
blocks of size k. Harshbarger has recently given a series of designs, rectangular lattices, 
which supplement Yates’ square lattices. 

In this paper two series of designs are given called cuboidal lattices, supplementing the 
cubic lattice series. They may be used to test respectively $k^2(k + 1)$ and $k(k + 1)^2$ varieties
in blocks of k, when the number of replications is a multiple of 3. Interblock information
may be recovered. The first series has a relatively simple analysis and should prove useful. 

This work was sponsored by the Office of Naval Research. 


29. Transformations Induced by Series Approximation of Prior Probability 
Amplitude. ARCHIE BLAKE, Office of The Surgeon General, U. S. Army.


Consider a class A of mutually exclusive and exhaustive possible outcomes of a test. 
(We assume A finite; this condition can under suitable conditions be removed at a later 
stage by a limiting process.) For a hypothesis h, let u be the vector whose value, for each
member a of A, is the square root of the prior probability of a and h jointly. This vector 
is called the probability amplitude; its norm, the scalar product u’u, is proportional to the 
prior probability of h, the constant of proportionality being determined by comparing the 
norms of the u’s for all h. Let the test leave the alternatives of a subclass B of A still 
possible, while ruling out the members of A — B. Represent this test by a vector r having 
the value 1 on B and 0 on A - B. Define d on AA as a matrix equal to r on the main
diagonal and zero elsewhere. The posterior probability is proportional to the form value
u’du, the norm of the projection of u on a subspace determined by suppressing the
coordinates of A - B. Consider the transformation u = tv, t being a matrix on AA and v
a vector on A. Then u’du takes the form v’t’dtv. Denote t’dt by e. If u is approximated
as a partial sum of the series tv, i.e. by truncating v with a subclass C of A, the truncation
induced on e is that with the minor on CC. (How much of the prior probability norm is
retained with a particular truncation is most easily seen if t is orthogonal, for then the
transform of u’u is v’v).

For example, in an agricultural experiment, let A be the composite of P, the class of 
plots, and Y, the class of possible yields on a plot. Then u takes the form of a second 
order tensor or matrix on PY, while d and t are fourth order tensors. For some member
y of Y, it often happens that some of the initially most probable, numerous, and economically
consequential hypotheses will be such that for them the values of u(y) are predominantly
high on some row of plots, low on another row, etc. The transformation u = tv
on P induces the transformation e = t’dt; this is R. A. Fisher’s transformation, performed,
however, on d instead of on the yields themselves. The truncation of v and e corresponds 
to Fisher’s relegating the higher interactions to error. This calculation may be accom- 
panied by a linear transformation on Y, e.g. in series of orthogonal functions. (Such 
series are not subject to the disadvantage of classical Gram-Charlier series, which are 
expressed in terms of the probability instead of its square root, that their partial sums 
can be at places negative.) 


30. On the Utilization of Marked Specimens in Estimating Populations of 
Flying Insects. CECIL C. CRAIG, University of Michigan, Ann Arbor.


The experimenter catches flying insects, say butterflies, marks and immediately releases 
them. It is assumed that all the insects in a segregated area are equally liable to capture 
whether unmarked or marked, even several times, and that the population is stable for this 
period over which a series of captures is made. From the record of insects caught once, 
twice, three times, and so on, the problem is to estimate the total population. Two mathe- 
matical models which seem appropriate are considered and four methods of estimation are 
compared with respect to the large sample variances of the estimates they give. 


31. On a Probability Distribution. MAX A. WOODBURY, University of Michigan,
Ann Arbor.


In this paper the probability of x successes in n trials of an event is computed for the
case when the probability of success in a given trial depends only on the number of previous
successes. The solution P(n, x) satisfies the equation of partial differences


$$P(n + 1, x + 1) = (q - q_x) P(n, x) + q_{x+1} P(n, x + 1)$$


in the case when q = 1. The boundary conditions are obviously P(0, 0) = 1 and P(n, x) = 0
for x < 0 or x > n. The solution of this equation is obtained by use of a generating function
and P(n, x) proves to be the xth term in the expansion of $q^n$ by means of Newton’s
divided difference formula given the values $q_0^n, \ldots, q_i^n, \ldots, q_x^n$. Specifically, by setting
q = 1, one obtains the result


$$P(n, x) = p_0 p_1 \cdots p_{x-1} \sum_{i=0}^{x} \frac{q_i^n}{(q_i - q_0)(q_i - q_1) \cdots (q_i - q_{i-1})(q_i - q_{i+1}) \cdots (q_i - q_x)}.$$

In the case $p_x = p_0$ one has the result


$$P(n, x) = \frac{p_0^x}{x!} \left. \frac{d^x q^n}{dq^x} \right|_{q = q_0}$$


which yields the usual result on simplification.
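
The difference equation and the divided-difference solution can be compared numerically. The sketch below is an illustration under assumptions: q[x] is taken to be the probability of failure after x previous successes, with success probability 1 - q[x]; the function names and the sample values are hypothetical; and the closed form requires the q values to be distinct.

```python
from math import comb, isclose


def success_distribution(n, q):
    """Return [P(n, 0), ..., P(n, n)] by stepping the difference equation forward."""
    probs = [1.0]                                  # P(0, 0) = 1
    for _ in range(n):
        nxt = [0.0] * (len(probs) + 1)
        for x, p_x in enumerate(probs):
            nxt[x] += q[x] * p_x                   # failure: still x successes
            nxt[x + 1] += (1.0 - q[x]) * p_x       # success: x + 1 successes
        probs = nxt
    return probs


def closed_form(n, x, q):
    """Divided-difference expression for P(n, x); needs distinct q[0..x]."""
    prefactor = 1.0
    for k in range(x):
        prefactor *= 1.0 - q[k]
    total = 0.0
    for i in range(x + 1):
        denom = 1.0
        for j in range(x + 1):
            if j != i:
                denom *= q[i] - q[j]
        total += q[i] ** n / denom
    return prefactor * total


q = [0.9, 0.7, 0.5, 0.3]                           # hypothetical failure probabilities
stepped = success_distribution(3, q)
print([round(v, 6) for v in stepped])              # [0.729, 0.193, 0.063, 0.015]
print(all(isclose(stepped[x], closed_form(3, x, q), abs_tol=1e-12) for x in range(4)))  # True
```

When every q[x] is the same, the stepped recurrence reduces to the binomial distribution, which is the "usual result" referred to in the abstract.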


32. Distribution-Free Tests of Data from Factorial Experiments. G. W. BROWN
AND A. M. MOOD, Iowa State College.


A device for avoiding the assumption of normality in analysis of variance problems was
developed by M. Friedman (Am. Stat. Assoc. Jour., Vol. 32 (1937), pp. 675-701) in which the
values of the observations were replaced by their ranks.

An alternative approach is presented here in which medians are used to construct certain 
contingency tables, and the various null hypotheses of interest are easily tested by means 
of the ordinary chi-square criterion applied to such tables. These tests: 

(1) Avoid the assumption of normality.

(2) Are particularly sensitive to differences in locations of cell distributions but not to
their shapes.

(3) Usually require very little arithmetic computation.

The tests and the relevant distribution theory have been worked out for some of the 
simpler experimental designs. 
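
For the simplest layout, several independent samples, the idea of building a contingency table from a median can be sketched as follows. This is an assumed illustration rather than the authors' procedure for factorial designs: observations are classified as above or not above the pooled median, and the ordinary chi-square statistic is computed for the resulting 2 x k table. The function name and the data are hypothetical.

```python
def median_table_chi_square(samples):
    """Return (pooled median, counts above, counts not above, chi-square statistic)."""
    pooled = sorted(v for s in samples for v in s)
    n = len(pooled)
    mid = n // 2
    pooled_median = pooled[mid] if n % 2 else 0.5 * (pooled[mid - 1] + pooled[mid])
    above = [sum(v > pooled_median for v in s) for s in samples]
    not_above = [len(s) - a for s, a in zip(samples, above)]
    total_above = sum(above)
    total_not = n - total_above
    if total_above == 0 or total_not == 0:
        raise ValueError("all observations fall on one side of the pooled median")
    chi_square = 0.0
    for s, a, b in zip(samples, above, not_above):
        for observed, column_total in ((a, total_above), (b, total_not)):
            expected = len(s) * column_total / n   # expected count under the null hypothesis
            chi_square += (observed - expected) ** 2 / expected
    return pooled_median, above, not_above, chi_square


# Hypothetical data: two samples with completely separated locations.
print(median_table_chi_square([[1, 2, 3, 4], [5, 6, 7, 8]]))  # (4.5, [0, 4], [4, 0], 8.0)
```

For k samples the statistic is ordinarily referred to the chi-square distribution with k - 1 degrees of freedom, an approximation that is poor for counts as small as those in this toy example.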


33. On Sums of Symmetrically Truncated Normal Random Variables. FRED
C. ANDREWS AND Z. W. BIRNBAUM, University of Washington, Seattle.


Let $X_a$ be the random variable with the probability density

$$f_a(X) = C e^{-X^2/2} \ \text{for } |X| \le a, \qquad f_a(X) = 0 \ \text{for } |X| > a,$$

and let $S_a^{(n)} = \sum_{j=1}^{n} X_a^{(j)}$ where $X_a^{(1)}, \ldots, X_a^{(n)}$ are independent determinations of $X_a$.
The problem considered is: for given n, T > 0, $\epsilon > 0$, determine a such that
$P(|S_a^{(n)}| > T) = \epsilon$. The exact solution of this problem would require laborious
computations. In this paper a method is given for obtaining approximate values of a which are
"safe," i.e. such that $P(|S_a^{(n)}| > T) \le \epsilon$.


34. On the Foundation of Statistics. MAX A. WOODBURY, University of
Michigan, Ann Arbor.


The results in this paper are part of the author’s University of Michigan dissertation,
"Probability and Expected Values." The work covered by this paper was sponsored by
the Office of Naval Research. One may take the notion of an expected value as the basis 
for the theory of Statistics; i.e. a linear functional on a linear space of random variables 
(real valued functions defined over a population). The space is called statistical if it con- 
tains all constant functions and the expected value of such constant functions is just the 
constant and if the expected value of a non-negative function is non-negative. A statistical 
space is called strong if it contains with a random variable also the random variable whose 
values are the absolute values of the given random variable. Every expected value defines 
a probability measure over a quorum of subsets of the population and it is shown that the 
integral of the random variable, if it exists, coincides with the expected value. Further 
it is shown that if the statistical space is strong the integral necessarily exists and also 
that a necessary and sufficient condition that the quorum be a field is that the statistical 
space be strong. 


35. Finitely Additive Probability Functions. MAX A. WOODBURY, University
of Michigan, Ann Arbor.


The results in this paper are part of the work in the author’s University of Michigan 
dissertation, "Probability and Expected Values." The work covered by this paper was
sponsored by the Office of Naval Research. A quorum is a family of sets that contains 
with each pair of disjoint sets also their union and also the complement of any of its sets. 
Trivially a quorum is required to contain at least one set and hence at least the universe set 
or population and the empty set. An extension of the notion of a finitely additive
probability measure function to quorums is given and proved to be equivalent to the usual
definition in case the quorum is a field of sets. The extension of a quorum of sets relative
to the probability measure function is investigated using the properties of the inner and 
outer measure. The upper and lower integrals are defined and a condition for the existence 
of the integral is given. When the quorum is a field it is shown that integrability of a 
function implies the existence of the distribution function. This last result is well known 
in the case where the probability measure function is completely additive. 


36. On Inverting a Matrix via the Gram-Schmidt Orthogonalization Process. 
MAX A. WOODBURY, University of Michigan, Ann Arbor.


The application of the classical Gram-Schmidt orthogonalization process to the
factorization of a correlation matrix is accomplished by considering the inner product
$[x, y] = E(xy)$ in the linear space determined by the statistical variables $x_1, x_2, \ldots, x_n$. In this
way a representation of the original set of statistical variables in terms of an orthonormal
set is obtained. (By an orthonormal set we mean a set $\xi_1, \xi_2, \ldots, \xi_n$ such that $E(\xi_i \xi_j) = 0$
for $i \neq j$ and $E(\xi_i^2) = 1$.) The matrix of coefficients $B = (b_{ij})$, where $x_i = \sum_j b_{ij} \xi_j$, has
the property that $C = BB'$ where $C = (E(x_i x_j))$ and $'$ denotes the transpose. Further the
matrix B is triangular, hence $B^{-1}$ is readily computed, from which one obtains at once
$C^{-1} = (B^{-1})'B^{-1}$. The quantities $b_{ij}$ are readily obtainable by the method of determinants
(Dwyer and Waugh, Annals of Math. Stat., Vol. 16 (1945), pp. 259-271, cf. p. 264) formerly
called the method of multiplication and subtraction with division. 
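
The inversion route C = BB', then C^{-1} = (B^{-1})'B^{-1}, can be sketched with plain lists. Note the substitution: the triangular factor is computed here by the familiar Cholesky recursion, which yields the lower-triangular B with positive diagonal, rather than by the method of determinants cited in the abstract. The function names and the sample correlation matrix are hypothetical.

```python
from math import sqrt


def triangular_factor(c):
    """Lower-triangular B with C = B B' for a symmetric positive definite C."""
    n = len(c)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(b[i][k] * b[j][k] for k in range(j))
            b[i][j] = sqrt(s) if i == j else s / b[j][j]
    return b


def invert_lower_triangular(b):
    """Invert a lower-triangular matrix column by column using forward substitution."""
    n = len(b)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for i in range(col, n):
            rhs = 1.0 if i == col else 0.0
            s = rhs - sum(b[i][k] * inv[k][col] for k in range(col, i))
            inv[i][col] = s / b[i][i]
    return inv


def invert_via_factor(c):
    """Return C^{-1} = (B^{-1})' B^{-1}."""
    n = len(c)
    b_inv = invert_lower_triangular(triangular_factor(c))
    return [[sum(b_inv[k][i] * b_inv[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical 3 x 3 correlation matrix.
corr = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
corr_inv = invert_via_factor(corr)
product = [[sum(corr[i][k] * corr_inv[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
print([[round(v, 10) for v in row] for row in product])  # approximately the identity matrix
```

Because B is triangular, its inverse needs only forward substitution, which is the computational advantage the abstract points to.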


37. Certain Properties of the Multiparameter Unbiased Estimates. G. R.
SETH, Iowa State College.


If $\theta^* = (\theta_1^*, \theta_2^*, \ldots, \theta_q^*)$ is an unbiased vector estimate of $\theta = (\theta_1, \theta_2, \ldots, \theta_q)$ in the
density function $p(x_1, x_2, \ldots, x_n; \theta_1, \theta_2, \ldots, \theta_q)$ having the smallest concentration
ellipsoid among the class of unbiased estimates of $\theta$, and further if $\epsilon$ is any statistic of q
components having $E(\epsilon) = 0$ and finite covariance matrix, then $\epsilon$ is uncorrelated with $\theta^*$.

If a set of sufficient statistics $(T_1, T_2, \ldots, T_p)$, $p < q$, exists for estimating $\theta$, then
corresponding to any unbiased vector estimate $\theta^*$ of $\theta$, there exists an unbiased estimate
of $\theta$ depending on $T_1, T_2, \ldots, T_p$ alone, where the latter has a concentration ellipsoid
equal to or contained in that of the former.

When q = 1, and $\theta^*$ has the smallest variance among the class c formed by unbiased
estimates of $\theta$ which are functions of $\theta^*$ having a finite variance, and the set of polynomials
with respect to the distribution function of $\theta^*$ is complete, then $\theta^*$ is the only element in
the class c. For q > 1, the result holds when the "variance" is replaced by the "concentration
ellipsoid."


38. A Class of Lower Bounds for the Variance of Point Estimates. DOUGLAS
CHAPMAN, University of California, Berkeley.

A class of lower bounds for the variance of point estimates is derived by means of the 
calculus of finite differences under very weak restrictions and it is shown that they give 
valid lower bounds for certain parameter estimation problems for which the Cramér-Rao 
formula is invalid. In some cases even when the latter lower bound exists a sharper lower 
bound may be found in the class here defined. On the other hand when it exists, the Cramér- 
Rao lower bound is asymptotically superior to any of this class. 


39. Standard Errors and Tests of Significance for Interpolated Medians.
CHURCHILL EISENHART AND MIRIAM L. YEVICK, National Bureau of Standards.


If a sample of N observations is grouped by a sequence of class intervals with boundaries
$-\infty, \ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots, +\infty$, where $x_0$ is the largest boundary point for which
the observed 'fraction below', $p_B$, is less than 1/2, and $x_1$ is the smallest boundary point for
which the observed 'fraction above', $p_A$, is less than 1/2, so that the observed 'central
fraction', $p_C$, between $x_0$ and $x_1$ is positive, then, at least for the case of N large, standard
textbooks take as the median of the grouped data the interpolated median,


$$m = x_0 + b(x_1 - x_0)$$


where

$$b = (\tfrac{1}{2} - p_B)/p_C.$$


The literature is silent regarding the sampling properties of such medians, and regarding
tests of significance appropriate to them. Let $P_B$ and $P_C$ be the population fractions
below $x_0$, and between $x_0$ and $x_1$, respectively, and let $\mu$ and $\beta$ be the population analogs
of m and b obtained by replacing $p_B$ and $p_C$ in the above equations by $P_B$ and $P_C$,
respectively. It is shown that m is asymptotically normally distributed about $\mu$ so defined with
asymptotic variance given by


$$\frac{1}{N Y_C^2}\left[P_B(1 - P_B) - 2\beta P_B P_C + \beta^2 P_C(1 - P_C)\right]$$

where $Y_C = P_C/(x_1 - x_0)$ = ordinate of 'central rectangle' of 'population histogram'. The
classical formula for the variance of a median can be obtained as the limit of the above
when $(x_1 - x_0) \to 0$ with $P_B \to \tfrac{1}{2}$.

In addition, tests of hypotheses regarding the value of the 'interpolated median of the
population', $\mu$, and regarding the difference, $\mu_2 - \mu_1$, of the interpolated medians of two
populations, are developed (1) by utilizing the above asymptotic results, and (2) by
utilizing the Neyman-Pearson likelihood-ratio-test approach.
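
The interpolated median and a plug-in value of its asymptotic variance can be computed as sketched below. Several assumptions are made for illustration: the outermost class boundaries are finite; the variance is taken in the delta-method form (1/(N Y_C^2))[P_B(1 - P_B) - 2 beta P_B P_C + beta^2 P_C(1 - P_C)] with Y_C = P_C/(x_1 - x_0); sample fractions are substituted for population fractions; and the function names and data are hypothetical.

```python
def interpolated_median(boundaries, counts):
    """Return (m, p_B, p_C, x0, x1) for classes lying between consecutive boundaries."""
    total = sum(counts)
    # Fraction of the sample below each boundary point.
    below = [sum(counts[:k]) / total for k in range(len(boundaries))]
    i0 = max(k for k, f in enumerate(below) if f < 0.5)        # largest point with fraction below < 1/2
    i1 = min(k for k, f in enumerate(below) if 1.0 - f < 0.5)  # smallest point with fraction above < 1/2
    p_b = below[i0]
    p_c = below[i1] - below[i0]
    x0, x1 = boundaries[i0], boundaries[i1]
    b = (0.5 - p_b) / p_c
    return x0 + b * (x1 - x0), p_b, p_c, x0, x1


def plug_in_variance(n, p_b, p_c, x0, x1):
    """Asymptotic variance with sample fractions used in place of population ones."""
    beta = (0.5 - p_b) / p_c
    y_c = p_c / (x1 - x0)                                      # ordinate of the central rectangle
    bracket = p_b * (1 - p_b) - 2 * beta * p_b * p_c + beta ** 2 * p_c * (1 - p_c)
    return bracket / (n * y_c ** 2)


# Hypothetical grouped sample of N = 100 observations.
m, p_b, p_c, x0, x1 = interpolated_median([0, 10, 20, 30, 40], [10, 30, 40, 20])
print(round(m, 6), round(plug_in_variance(100, p_b, p_c, x0, x1), 6))  # 22.5 1.09375
```

The square root of the plug-in variance gives an approximate standard error for m, here a little over one unit on a class width of ten.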





40. Some Efficient Range-Estimates of Variation. NILAN NORRIS, Hunter
College, New York. 


The commonly used sample range (in the sense of the difference between the largest and 
smallest of the variates) is one of an unlimited number of range or difference-measures 
which can be used to scale parent populations. For samples drawn from a Type III uni- 
verse, the maximum-likelihood estimate of dispersion is given by A — G, where A is the 
sample arithmetic mean and G is the sample geometric mean. For samples drawn from a 
Type V universe, a 100% efficient estimate of absolute variation is given by G — H, where 
G is the sample geometric mean and H is the sample harmonic mean. Under certain general 
conditions usually fulfilled, the standard errors of both of these range-measures of absolute 
dispersion may be estimated from expressions obtained by application of the Laplace- 
Liapounoff theorem. The two parametric methods of estimating absolute variation as 
developed in this paper are likely to be most useful when the form of the parent universe 
is known, and it is either too expensive or impossible to obtain samples large enough to 
permit the use of inefficient estimates. An example of such a case is the learning curve 
encountered in the analysis of frequency of occurrence of aircraft accidents by hours of 
flying experience of pilots in training. E. J. G. Pitman, Proc. Camb. Phil. Soc., Vol. 33 
(1937), pp. 217-218, has discussed the scaling of the Type III distribution. The method
of scaling given by Pitman differs from the method of estimation developed in this paper for
the Type III universe. 
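
The two difference-measures named in this abstract, A - G and G - H, are easy to compute for positive data. The sketch below only evaluates them; the function name and the sample values are hypothetical, and the standard errors obtained in the paper are not reproduced.

```python
from statistics import fmean, geometric_mean, harmonic_mean


def mean_differences(values):
    """Return (A - G, G - H) for a sample of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("all observations must be positive")
    a = fmean(values)            # arithmetic mean A
    g = geometric_mean(values)   # geometric mean G
    h = harmonic_mean(values)    # harmonic mean H
    return a - g, g - h


# Hypothetical sample.
a_minus_g, g_minus_h = mean_differences([1.0, 2.0, 4.0])
print(round(a_minus_g, 6), round(g_minus_h, 6))  # 0.333333 0.285714
```

Both differences are zero when all observations are equal and positive otherwise, which is what lets them serve as measures of dispersion.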








NEWS AND NOTICES 
Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items


Dr. Franz L. Alt has resigned his position with the Ballistic Research Labora- 
tories at Aberdeen to join the National Bureau of Standards where he is in charge 
of the Computation Laboratory of the National Applied Mathematics
Laboratory.

Dr. Edward W. Barankin has been promoted to Assistant Professor and Re- 
search Associate at the Statistical Laboratory, University of California, Berkeley, 
California. 

Dr. Stanley Clark has accepted an associate professorship of Education at 
the College of Education, University of Saskatchewan, Saskatoon, Canada. 

Dr. Gerald J. Cox has resigned his position as Research Chemist in the Chemi- 
cal Division of Corn Products Refining Co., Argo, Illinois to accept an appoint- 
ment as Professor of Dental Research in the School of Dentistry of the Uni- 
versity of Pittsburgh. 

Mr. S. Lee Crump has resigned his assistant professorship at Iowa State Col- 
lege to accept a position in the Atomic Energy Project, University of Rochester. 

Dr. John H. Curtiss, Chief of the National Applied Mathematics Laboratories 
of the National Bureau of Standards, has assumed temporary additional duties 
as Acting Chief of the Institute for Numerical Analysis. The Institute for 
Numerical Analysis, located on the U.C.L.A. campus, was established by the 
National Bureau of Standards with the support of the Office of Naval Research 
and the United States Air Force for the two-fold purpose of pursuing mathemati- 
cal research aimed at the development of numerical techniques for the full ex- 
ploitation of the newer large-scale electronic computing machines and for per- 
forming numerical computations basic to the extension of the frontiers of science. 

Mr. Walter T. Federer has resigned his position at the Statistical Laboratory 
at the Iowa State College to accept a position as Professor of Biological Sta- 
tistics in the Department of Plant Breeding at Cornell University. 

Dr. John Gurland, who received his Ph.D. in mathematical statistics from the 
University of California in August, 1948, is now a Benjamin Peirce Instructor in
Mathematics at Harvard University. 

Dr. Joseph L. Hodges, Jr. has been promoted to Instructor and Research As- 
sociate at the Statistical Laboratory, University of California, Berkeley. 

Dr. Cyril J. Hoyt has resigned his position as Research Associate with the 
Department of Education at the University of Chicago to accept an appointment 
as Associate Director of the Bureau of Educational Research, University of 
Minnesota. 

Dr. Tjalling C. Koopmans has been promoted to Professor of Economics at
the University of Chicago and also Director of Research of the Cowles
Commission for Research in Economics.

Dr. Eugene Lukacs, formerly at Our Lady of Cincinnati College, has accepted 
a position as statistician at the United States Naval Ordnance Test Station at 
Inyokern, California. 

Mr. Frank Jones Massey, Jr., who has been in the Department of Mathematics 
at the University of Maryland, has accepted an assistant professorship in the 
Department of Mathematics at the University of Oregon, Eugene, Oregon. 

Miss Judith Moss has resigned her position at the National Bureau of Eco- 
nomic Research and is now with the Port of New York Authority as an Eco- 
nomic Analyst in the Planning Bureau. 

Dr. Richard Otter has accepted an assistant professorship in the Department 
of Mathematics at the University of Notre Dame. 

Dr. Nathan Grier Parke, III has been appointed Research Fellow of the 
Massachusetts General Hospital and Associated Research Director of the Harvard
Pediatric Study.

Dr. Joseph A. Pierce is now serving as Chairman of the Division of Natural 
Science and Mathematics at the Texas State University for Negroes, Houston 
4, Texas. 

Dr. Saul B. Sells, former Assistant to the President of the A. B. Frank Co. of 
San Antonio, Texas, has joined the staff of the Department of Psychology of the 
Air University, School of Aviation Medicine, Randolph Field, Texas. 

Dr. Otis A. Pope, who was with the Office of Foreign Agricultural Relations, 
U. S. Department of Agriculture, Technical Collaboration Branch, Washington,
D. C., died September 28th, 1948.




Special Summer Session in Survey Research Techniques 


The Survey Research Center of the University of Michigan will hold its special 
summer session in Survey Research Techniques from July 18 to August 13, 1949. 

The following courses will be offered: Introduction to Survey Research, Survey 
Research Methods, Sampling Methods in Survey Research (introductory and 
advanced), Mathematics of Sampling, Statistical Methods in Survey Research, 
Techniques of Scaling. 

In addition the introductory courses will be given from June 20 to July 16. 
This will permit students who are attending the full eight-week summer session 
of the University (June 20 to August 13) to register for the introductory courses 
during the first four weeks. 

It is expected that this special session will attract men and women employed 
in market research or other statistical work and university instructors and graduate
students with a particular interest in this area of social science research.

All courses are offered for graduate credit and students must be admitted by
the Graduate School. Inquiries should be addressed to the Survey Research
Center, University of Michigan, Ann Arbor, Michigan. 


Summer Courses in Statistics at Michigan 


In addition to the special courses in Survey Research Techniques, the follow- 
ing courses of special interest to students of statistics are among those offered 
by the mathematics department of the University of Michigan in the Summer 
Session, June 20 to August 13: Finite Differences (Fischer), Probability (Copeland),
Theory of Statistics I and II (Carver), Significance Tests (Dwyer), Computational
Methods (Dwyer), Theory of Estimation and of Significance Tests
(Craig) and Seminar (Craig).


The International Congress of Mathematicians 


No summer meeting of the Institute of Mathematical Statistics is planned for 
1950 because of the meeting of the International Congress of Mathematicians 
which will be held in Cambridge, Massachusetts, August 30 to September 6, 1950. 
The following statement has been prepared by the organizing committee: 

An International Congress of Mathematicians will be held in Cambridge, Mas- 
sachusetts, in 1950 under the auspices of the American Mathematical Society. 
The Society originally planned to act as host for a Congress in September, 1940, 
which was also scheduled to meet in Cambridge. At the 1936 Congress in Oslo, 
Norway, the invitation for the 1940 Congress was issued by the American dele- 
gation in the name of the American Mathematical Society. Plans for the 1940 
Congress were practically completed when the outbreak of World War II in 
September, 1939, made it necessary for the Society to postpone the Congress to 
a more favorable date. An Emergency Committee was established to carry on in 
the interim and, on recommendation of this Committee, the Council of the 
Society voted to hold the Congress in 1950. 

The 1950 Congress will be the third International Congress of Mathematicians 
to be held on the continent of North America. The first was held at North- 
western University in 1893 and the second at the University of Toronto in 1924. 
International Congresses were held at intervals of approximately four years, 
except when war intervened, until 1936. There has been no international gath- 
ering of mathematicians since that time and it is the sincere hope of the Or- 
ganizing Committee that the gathering in 1950 will be a truly international one, 
that the American mathematicians will attend in large numbers, and that all 
other countries will be well represented. The Council of the American Mathe- 
matical Society has voted unanimously to hold a Congress which will be open to 
mathematicians of all national and geographical groups. 

Time and Place. The dates for the Congress have been fixed as August 30- 
September 6, 1950. Harvard University will be the principal host institution. 
A number of other institutions in metropolitan Boston will join in the entertain- 
ment of Congress visitors by arranging special features on their campuses. 




Type of Congress. In recent years mathematicians have been much impressed 
by the success of the conference method for presenting recent research in fields 
where vigorous advances have just been made or are in progress. In view of the 
success of mathematical conferences on special topics which have been held in 
Russia, France and Switzerland and, more recently, at the Princeton Bicentennial 
Celebration, the 1950 Congress will include Conferences in several fields. For 
the 1940 Congress, Conferences in four fields had been planned. The number of 
Conferences was thus restricted lest the introduction of a promising and novel 
feature result in failure through the dissipation of interest and energy. 

Following the established custom, the Organizing Committee plans to have a 
number of invited hour addresses by outstanding mathematicians. In addition, 
sectional meetings for the presentation of contributed papers not included in 
Conference programs will be held in the following fields: I, Algebra and Theory 
of Numbers; II, Analysis; III, Geometry and Topology; IV, Probability and 
Statistics, Actuarial Science, Economics; V, Mathematical Physics and Applied 
Mathematics; VI, Logic and Philosophy; VII, History and Education. 

The official languages of the 1950 Congress will be English, French, German, 
Italian, and Russian. 

Organization. The plans for the Congress are under the supervision of an 
Organizing Committee which was elected by the Council of the American Mathe- 
matical Society in February, 1948. The Chairman is Professor Garrett Birkhoff 
of Harvard University and the Vice Chairman is Professor W. T. Martin of 
Massachusetts Institute of Technology. Other members of the committee are: 
Professors J. L. Doob, G. C. Evans, J. R. Kline, Solomon Lefschetz, Saunders 
MacLane, Dean R. G. D. Richardson, Professors Oswald Veblen, J. L. Walsh, 
D. V. Widder, Norbert Wiener, and R. L. Wilder. 

Many of the subventions promised for the 1940 Congress are still available. 
A Financial Committee under the chairmanship of Professor John von Neumann 
is endeavoring to secure additional funds. Besides support from Harvard Uni- 
versity and Massachusetts Institute of Technology, generous subventions have 
been subscribed for the Congress by the Carnegie Corporation, the Institute for 
Advanced Study, the National Research Council, and the Rockefeller 
Foundation. 

An Editorial Committee under the chairmanship of Professor Salomon Bochner 
will assume responsibility for the publication of the Proceedings of the Congress. 

Professor J. R. Kline of the University of Pennsylvania has been named Secre- 
tary of the Congress and Dr. R. P. Boas, Executive Editor of Mathematical 
Reviews, has been designated Associate Secretary. 

Entertainment. Harvard University has offered the use of its dormitories and 
dining rooms for mathematicians and their guests for the period of the Congress. 
The Organizing Committee hopes that it will be possible to furnish room and 
board without charge to all mathematicians from outside continental North 
America who are members of the Congress. Congress membership fees and rates 
for room and board will be announced well in advance of the opening of the 
Congress. 










The Entertainment Committee, of which Professor L. H. Loomis of Harvard 
University is Chairman, is planning many interesting features, including a 
reception, garden party, symphony concert, and banquet. It is hoped that 
American mathematicians will be able to assist in the entertainment by putting their 
automobiles at the disposal of the Entertainment Committee for trips to be 
made out of Cambridge. 

Every effort will be made to facilitate the travel at reasonable cost of foreign 
participants while in the United States. Previous to the Congress, opportunity 
will be given them to see New York City under the guidance of some mathe- 
maticians. 

Information. Detailed information will be sent in due course to individual 
members of the American Mathematical Society and to foreign mathematical 
societies and academies. Others interested in receiving information may file 
their names in the office of the Society, and such persons will receive from time 
to time information regarding the program and arrangements. 

Communications should be addressed to the American Mathematical Society, 
531 West 116th Street, New York City 27, U.S. A. 




New Members 


The following persons have been elected to membership in the Institute 
(August 16, 1948 to November 30, 1948) 


Alman, John E., M.A. (Claremont Colleges) instructor in Mathematics, College of Liberal 
Arts, Boston University, 216 Gardner Road, Brookline 46, Massachusetts. 

Andrian, Jane F., M.S. (Western Reserve Univ.) Graduate student at University of Cali- 
fornia, 1222 C. Ashby Avenue, Berkeley 2, California. 

Arbuckle, Richard A., B.S. (Baldwin Wallace College) Research-Industrial Fellow at 
Purdue University, F.Ph.A. 530-3 Airport Road, Lafayette, Indiana. 

Barankin, Edward W., Ph.D. (Univ. of Calif.) Assistant Professor of Mathematics and 
Research Associate in Statistical Laboratory, University of California, Berkeley, 
California. 

Blum, Julius R., Student in mathematical statistics at the University of California, 1957 
Acton Street, Berkeley 2, California. 

Bronfenbrenner, Mrs. Jean, M.A. (Univ. of Chicago) Research Assistant, Cowles Com- 
mission, University of Chicago, Chicago 37, Illinois. 

Burns, Loren V., B.S. (Washburn College, Topeka, Kansas) Technical Director, MFA 
Milling Co., Box 1585 S.S.S., Springfield, Missouri. 

Clement, Edwin G., M.B.A. (Univ. of Chicago) Captain, Chief of Management Control 
Branch, Headquarters, Strategic Air Command, Andrews Air Force Base, Washing- 
ton 20, D. C. 

Cramer, George F., Ph.D. (Univ. of Missouri) Mathematician, U. S. Navy Department, 
Washington, D. C., 112 Quincy Street, Chevy Chase 15, Maryland. 

Degan, James W., A.B. (Univ. of Chicago) Research Assistant, Psychometric Laboratory, 
University of Chicago, 1128 East 61st Street, Chicago 37, Illinois. 

Dodd, Stuart C., Ph.D. (Princeton) Research Professor of Sociology and Director of Public 
Opinion Laboratory, 4725—45th Avenue, N.E., Seattle 5, Washington. 

Donnelly, Tom G., M.A. (Queen’s Univ.) Graduate student at the University of North 
Carolina, Room 213 ‘‘B’’, Chapel Hill, North Carolina. 







Edwards-Davies, Harold D., Special Lecturer, Department of Mathematics, Dalhousie 
University, 67 Seymour Street, Halifax, N.S., Canada. 

Ellner, Henry, Ch.E. (College of City of New York) Statistician (Physical Sciences) 
1-C Oak Grove Drive, Baltimore 20, Maryland. 

Feigenbaum, Armand V., M.S. (Mass. Institute of Tech.) General Electric Company, 
Room 257, Building 23, Schenectady, New York. 

Festinger, Leon, Ph.D. (Univ. of Iowa) Assistant Professor of Psychology, Research Cen- 
ter for Group Dynamics, University of Michigan, Ann Arbor, Michigan. 

Frame, James S., Ph.D. (Harvard) Professor and Head of Department of Mathematics, 
Michigan State College, Lansing, Michigan. 

French, Benjamin J., M.Ed. (Univ. of New Hampshire) Examiner, Educational Testing 
Service, Matthews Road, Keene, New Hampshire. 

Gaffey, William R., A.B. (Univ. of Calif.) Research Assistant, University of California, 
2306 Grant Street, Berkeley 4, California. 

Goodman, Leo A., A.B. (Syracuse University) Research Assistant in Mathematical Sta- 
tistics and Graduate student at Princeton University, Fine Hall, Princeton Univer- 
sity, Princeton, New Jersey. 

Hader, Robert J., Ph.D. (North Carolina State College) Instructor and Research As- 
sistant, Institute of Statistics, North Carolina State College, Raleigh, North Carolina. 

Haley, Kenneth D., M.S. (Stanford Univ.) Assistant Professor of Mathematics, Acadia 
University, Wolfville, Nova Scotia, Canada. 

Kahn, Louis B., M.S. (Univ. of Wisconsin) Research Associate, University of Wisconsin, 
Box 16-F, Badger, Wisconsin. 

Katz, Irving, B.S. (College of City of New York) Statistician, Strategic Air Command, 
379—37 Place, S.E., Washington 19, D. C. 

Kientzle, Mary J., Ph.D. (Univ. of Ill.) Assistant Professor of Psychology, Department of 
Psychology, Washington State College, Pullman, Washington. 

Koditschek, Paul, LL.D. (Univ. of Vienna) Research Associate, Scientific Research 
Service, Columbia University, 319 W. 13th Street, New York 14, New York. 

Levin, Howard S., S.B. (Univ. of Chicago) Electronic Engineer, Glenn L. Martin Co., 
532 Addison Street, Chicago 13, Illinois. 

Levine, George J., B.S. (Brooklyn College) Actuarial Mathematician, 5109—1st Street, 
North, Arlington, Virginia. 

Liverman, J. G., B.A. (Cantab) Civil Servant, Ministry of Fuel and Power, 21 Ascot Court, 
Grove End Road, London, N.W. 8, England. 

Loeve, Michel, Ph.D. (Sorbonne, Paris) Professor and Research Associate in Statistical 
Laboratory, Durant Hall, University of California, Berkeley, California. 

Loo, Ching-Tsu, Ph.D. (Univ. of Chicago) Research Associate, Statistical Laboratory, 
University of California, Berkeley, California. 

Lubin, Ardie, B.S. (Univ. of Chicago) Statistician, Psychology Department, Maudsley 
Hospital, Denmark Hill, S.E. 5, London, England. 

Moses, Lincoln E., A.B. (Stanford Univ.) 7 Perry Lane, Menlo Park, California. 

Mourier, Edith, Licencié-ès-sciences (Univ. of Caen, France) Teaching Assistant, 
Statistical Laboratory, University of California, Berkeley, California. 

Osborne, Ernest L., L.L.B. (LaSalle Univ.) Economic Analyst, Department of the Army, 
Chancery Apartments, 3130 Wisconsin Avenue, N.W., Washington 16, D. C. 

Pabst, William R. Jr., Ph.D. (Columbia Univ.) Quality Control Division, Bureau of Ord- 
nance, Navy Department, 3420 Quebec Street, N.W., Washington 16, D.C. 

Plackett, Robin L., M.A. (Cambridge, England) Lecturer in Mathematical Statistics, 
Department of Applied Mathematics, The University, Liverpool 3, England. 

Proschan, Frank, M.A. (George Washington Univ.) Research Analyst, 1627 R. St., N.W., 
Washington 9, D.C. 










Rau, A. Ananthapadmanabha, M.S. (Iowa State College) Statistician and Agricultural 
Meteorologist, Department of Agriculture, Bangalore, Mysore State, India. 

Rees, Mina, Ph.D. (Univ. of Chicago) Head, Mathematics Branch, Office of Naval Re- 
search, R2719, T-3 Building, Washington 25, D. C. 

Roberts, Spencer W. Jr., M.S. (Univ. of Michigan) Research Associate, University of 
Michigan Department of Engineering Research, 306 Thompson Street, Ann Arbor, 
Michigan. 

Sarma, S. C., M.Sc. (Calcutta Univ.) Graduate student in mathematical statistics at 
Columbia University, 1120 John Jay Hall, Columbia University, New York 27, 
New York. 

Schneiderman, Marvin A., B.S. (College of City of New York) Statistician, Biological, Na- 
tional Institute of Health, T-6, 2215, Bethesda, Maryland. 

Schull, William J., Ph.D. (Ohio State Univ.) Student at Ohio State University, 
Department of Zoology, Ohio State University, Columbus 10, Ohio. 

Schweid, Samuel, B.S.S. (College of City of New York) Statistician, Industry Division, 
Bureau of the Census, 1110 Monroe Street, N.W., Washington 10, D. C. 

Wallace, David L., B.S. (Carnegie Institute of Tech.) Graduate Student and Teaching 
Assistant in Mathematics, Carnegie Institute of Technology, 123 Lawrence Avenue, 
Homestead Park, Pennsylvania. 

Williams, Evan James, B.C. (Univ. of Tasmania) Research Officer, Section of Mathe- 
matical Statistics, Division of Forest Products, C.S.I.R., P.O. Box 18, South Mel- 
bourne, S.C. 4, Australia. 

Zavrotsky, Andres, Head of the Statistical Department of the Venezuela Office for Social 
Insurance, Mercedes a Luneta 39, Caracas. 


Correction of New Members in June, 1948 issue: 

Loizelier, Enrique Blanco, should be written as follows: 

Blanco Loizelier, Enrique. (Ph.D.) Professor of Statistics, Economics Faculty, Madrid 
University, Spain, Nervion No. 4, Madrid, Spain. 


ELECTION OF OFFICERS AND COUNCIL AND REVISION OF BY-LAWS 


At the membership meeting held at Cleveland on December 28, the following 
officers and members of the Council were elected: 


President: J. Neyman 
President-Elect: J. L. Doob 
Council: 
3-year term: W. G. Cochran, C. Eisenhart, H. Hotelling, A. Wald 
2-year term: W. Feller, P. G. Hoel, H. Scheffé, J. Wolfowitz 
1-year term: Gertrude Cox, M. A. Girshick, J. W. Tukey, J. von Neumann 




The By-Laws were also revised and further action was taken. More detailed 
accounts of this meeting will be sent directly to the members. 
Paul S. Dwyer 
Secretary 


REPORT ON THE SEATTLE MEETING OF THE INSTITUTE 


The thirty-sixth meeting and fourth Regional West Coast meeting of the 
Institute of Mathematical Statistics was held in Seattle, Washington, November 
26-27, 1948. The sessions of November 27, 1948 were held jointly with the 
Biometric Society (Western N. A. Region). The meeting was attended by 91 
persons, including the following 22 members of the Institute: 


F. C. Andrews, E. W. Barankin, Z. W. Birnbaum, A. H. Bowker, D. G. Chapman, R. C. 
Davis, W. J. Dixon, E. Fay, M. A. Girshick, P. Horst, H. M. Hughes, J. C. R. Li, F. Massey, 
J. Neyman, E. Paulson, Elizabeth L. Scott, Esther Seiden, M. Sobel, Z. Szatrowski, J. R. 
Vatnsdal, J. E. Walsh and Zivia S. Wurtele. 


At the morning session on November 26, Professor R. M. Winger of the Uni- 
versity of Washington as chairman welcomed those attending the meetings, and 
the following program of contributed papers was presented: 


1. Estimation of the Variance of the Bivariate Normal Distribution. 
Harry M. Hughes, University of California. 
2. Derivation of a Broad Class of Consistent Estimates. 
R. C. Davis, NOTS, Inyokern, California. 
3. Locally Best Unbiased Estimates. 
Edward W. Barankin, University of California. 
4. Some Problems Related to the Distribution of a Random Number of Random Variables. 
Edward Paulson, University of Washington. 
5. Asymptotic Expansions for the Distribution of Certain Likelihood Ratio Statistics. 
Albert H. Bowker, Stanford University. 
6. On a Problem of Confounding in Symmetrical Factorial Design. 
Esther Seiden, University of California. 
7. Some Bounded Significance Level Tests of Whether the Largest Observations of a Set are Too Small. 
John E. Walsh, Project RAND, Douglas Aircraft Corp., Santa Monica, Calif. 


The afternoon session of November 26, under the chairmanship of Professor 
J. Neyman of the University of California at Berkeley, had the following 
program: 


1. Invited paper: 
Multiple Decision Functions. 
M. A. Girshick, Stanford University. 
Contributed papers: 
2. Determination of Optimal Test Length to Maximize the Multiple Correlation Coefficient. 
Paul Horst, University of Washington. 
3. Some Numerical Comparisons of a Non-Parametric Test with Other Tests. 
F. J. Massey, University of Oregon. 
4. On the Deviation of Extreme Values. 
W. J. Dixon, University of Oregon. 
5. The Optimum Size of Interval for Making Measurements of a Rocket’s Angular Velocity. 
Edward A. Fay, University of California. 
6. Stationary Time Series Analysis and Common Stock Price Forecasting. 
Zenon Szatrowski, University of Oregon. 


At the morning session of November 27, with Professor W. F. Thompson of the 
University of Washington as chairman, the program consisted of the following 
papers: 


1. Invited paper: 
On the Place of Statistics in Fishery Biology. 
Willis S. Rich, Stanford University and U. S. Fish and Wildlife Service. 
Contributed papers: 
2. Distribution of the Number of Schools of Fish Caught per Boat. 
J. Neyman, University of California. 
3. Some Problems in Fishery Research to which Statistical Methods are Applicable. 
Ralph Silliman, U. S. Fish and Wildlife Service, Seattle, Washington. 
4. The Application of the Hypergeometric Distribution to Problems of Estimating and Comparing Zoological Population Sizes. 
Douglas Chapman, University of California. 
5. Extension to Multivariate Case of Neyman’s Smooth Test. 
Elizabeth L. Scott, University of California. 
6. A Mathematical Theory of Vitamin A Metabolism in Fish. 
Norman E. Cooke, Pacific Fisheries Experimental Station, Vancouver, B. C. 


The afternoon session of November 27 was held under the chairmanship of 
Professor F. W. Weymouth of Stanford University, with the following program: 


1. Invited paper: 
Statistical Problem of Enumeration of Fish Eggs in the Sea. 
Oscar E. Sette, U. S. Fish and Wildlife Service, San Francisco. 
Contributed papers: 
2. The Interactance Hypothesis. 
Stuart C. Dodd, University of Washington. 
3. The Employment of Marked Members in Estimation of Animal Populations. 
Milner E. Schaefer, Stanford University. 
4. Non-Response and Repeated Call-Backs in Opinion Polls. 
Z. W. Birnbaum, University of Washington. 
5. Statistical Problems Relating to Fisheries. 
J. L. Hart, Pacific Biological Station, Nanaimo, B. C. 


On November 26, at 6:30 o’clock there was a dinner for members and guests 
at the Edmond Meany Hotel. 


Z. W. Birnbaum 




REPORT ON THE CLEVELAND MEETING OF THE INSTITUTE 


The Eleventh Annual Meeting of the Institute of Mathematical Statistics was 
held at the Statler Hotel, Cleveland, Ohio, on December 27-30, 1948. The 
meeting was held in conjunction with the Annual Meeting of the American 
Statistical Association. The following 176 members of the Institute were in 
attendance: 


P. H. Anderson, R. L. Anderson, L. W. Anderson, Max Astrachan, G. J. Auner, T. A. 
Bancroft, B. Geoffrey, Z. W. Birnbaum, Archie Blake, E. E. Blanche, C. I. Bliss, Dorothy 
S. Brady, A. E. Brandt, G. W. Brown, T. H. Brown, M. A. Brumbaugh, P. T. Bruyère, R. W. 
Burgess, I. W. Burr, J. M. Cameron, A. G. Carlton, Harry Carver, F. R. Cella, Uttam Chand, 
R. A. Chapman, Edmund Churchill, Herman Chernoff, W. G. Cochran, Jerome Cornfield, 
J. H. Cover, Gertrude M. Cox, C. C. Craig, S. L. Crump, J. H. Curtiss, D. A. Darling, W. 
L. Deemer, D. B. DeLury, W. E. Deming, Philip Desind, H. F. Dorn, C. W. Dunnett, P. S. 
Dwyer, Churchill Eisenhart, Benjamin Epstein, C. D. Ferris, Leon Festinger, C. H. Fischer, 
J. C. Flanagan, M. M. Flood, L. R. Frankel, D. A. S. Fraser, H. A. Freeman, Milton Friedman, 
H. C. Fryer, E. F. Gardner, R. S. Gardner, H. H. Germond, William Gomberg, E. L. 
Green, S. W. Greenhouse, J. Gurland, R. J. Hader, K. W. Halbert, H. J. Hand, M. H. Hansen, 
T. E. Harris, Boyd Harshbarger, P. M. Houser, J. F. Hofmann, Harold Hotelling, A. S. 
Householder, E. E. Houseman, Helen M. Humes, C. C. Hurd, C. M. Jaeger, R. J. Jessen, 
H. L. Jones, Irving Katz, Leo Katz, Harriet J. Kelly, O. Kempthorne, A. W. Kimball, Jr., 
A. J. King, Leslie Kish, L. A. Knowler, Lila F. Knudsen, C. F. Kossack, O. E. Lancaster, 
Marvin Lavin, S. B. Littauer, Irving Lorge, F. W. Lott, Jr., Eugene Lukacs, P. J. McCarthy, 
C. J. Maloney, John Mandel, Nathan Mantel, H. B. Mann, E. S. Marks, Margaret 
Merrell, Helen Michaels, E. B. Mode, A. M. Mood, Nathan Morrison, Dorothy J. Morrow, 
J. W. Morse, J. E. Morton, Jack Moshman, Frederick Mosteller, B. D. Mudgett, Hugo 
Muench, M. R. Neifeld, R. H. Noel, G. E. Noether, J. I. Northam, H. W. Norton, J. A. 
Norton, Jr., E. G. Olds, P. S. Olmstead, Bernard Ostle, A. E. Paull, Paul Peach, M. P. 
Peisakoff, E. W. Pike, E. J. G. Pitman, R. A. Porter, J. A. Rafferty, L. J. Reed, Olav Reiersol, 
William Reitz, F. D. Rigby, A. C. Rosander, Herman Rubin, Erik Ruist, P. J. Rulon, 
Max Sasuly, F. E. Satterthwaite, L. J. Savage, Mary Ann Savas, Marvin Schneiderman, 
Elizabeth Scott, G. R. Seth, Jack Sherman, S. S. Shrikhande, C. R. Simms, J. H. Smith, 
G. W. Snedecor, Mortimer Spiegelman, B. R. Stauber, F. F. Stephan, Joseph Steinberg, 
J. V. Sturtevant, B. J. Tepping, W. R. Thompson, J. W. Tukey, Jan Vchytil, W. R. Van 
Voorhis, D. F. Votaw, Jr., F. M. Wadley, Helen M. Walker, D. L. Wallace, W. A. Wallis, 
G. S. Watson, Leonel Weiss, Samuel Weiss, E. L. Welker, M. E. Wescott, Phillips Whidder, 
D. R. Whitney, S. S. Wilks, C. P. Winsor, Gerald Winston, M. A. Woodbury, T. D. Woolsey, 
Holbrook Working, W. J. Youden. 


The first session, a joint session with the American Statistical Association, 
was held at 2:00 P.M. on Monday, December 27, at which time a paper entitled 
Statistical Concepts in an Infinite Number of Dimensions was presented by Pro- 
fessor David H. Blackwell of Howard University. Professor E. J. G. Pitman 
of the University of Tasmania was chairman. 

The second session of the opening day was devoted to contributed papers in 
mathematical statistics, and was held at 4:00 P.M. in conjunction with the 
American Statistical Association. Professor W. R. Van Voorhis of Fenn College 
was chairman. The following papers were presented: 

1. A Necessary Condition for a Certain Class of Characteristic Functions. Preliminary report. 
Eugene Lukacs, NOTS, Inyokern, California and Our Lady of Cincinnati College, Cincinnati, Ohio. 
2. Precision of Estimates from Samples Selected under Marginal Restrictions. Preliminary report. 
Clifford J. Maloney, Research and Development Department, Camp Detrick, Frederick, Maryland. 
3. Properties of Maximum and Quasi-Maximum Likelihood Estimates of Parameters of a System of Linear Stochastic Difference Equations with Serially Correlated Disturbances. Preliminary report. 
Herman Rubin, Cowles Commission, University of Chicago. 
4. The Computation of Maximum Likelihood Estimates of Parameters of a System of Linear Stochastic Difference Equations with Serially Correlated Disturbances. 
Herman Chernoff, Cowles Commission, University of Chicago. 
5. Test Criteria for Hypotheses of Symmetry and Definiteness of a Regression Matrix for Demand Functions. 
Uttam Chand, University of North Carolina. 
6. The Distribution of Extreme Values in Samples whose Members are Stochastically Dependent. 
Benjamin Epstein, Wayne University. 


A session on Teaching Statistical Quality Control was held on Monday evening, 
December 27, jointly with the Ohio Section of the American Society for Quality 
Control and Section on Training of Statisticians of the American Statistical 
Association. Professor Samuel S. Wilks of Princeton University presided at the 
session. The following two papers were presented: 


1. Teaching Statistical Quality Control for Town and Gown. 
Lloyd A. Knowler, State University of Iowa. 
2. Instructional Aids for Statistical Quality Control. 
Edwin G. Olds, Carnegie Institute of Technology. 


The session concluded with discussion by Professor Irving W. Burr of Purdue 
University, and Professor Theodore H. Brown of Harvard University. 


A session on Review of Statistical Methodology was held jointly with the American 
Statistical Association at 2:00 P.M., December 28. Professor Frederick 
Mosteller of Harvard University presided. The following papers were presented: 


1. Surveys and Sampling. 
Philip J. McCarthy, Cornell University. 
2. Industrial Applications. 
Paul S. Olmstead, Bell Telephone Laboratories. 
3. Biology, Physical Sciences and Experimental Design. 
W. J. Youden, National Bureau of Standards. 


At 4:00 P.M. on Tuesday, December 28, Professor H. C. Fryer of Kansas 
State College presided at a joint session with the Biometric Society and Bio- 
metrics Section of the American Statistical Association. Papers presented were: 


1. Evaluation of Field Insecticides from Count of Survivors. 
C. I. Bliss and Neely Turner, Connecticut Agricultural Experiment Station. 
2. Curved Dosage-Response Curves. 
Oscar Kempthorne, Iowa State College. 
3. Statistical Variations in Contents of Dry-Filled Ampuls in Current Pharmaceutical Practice. 
M. W. Green, American Pharmaceutical Association, and Lila F. Knudsen, Food and Drug Administration. 
4. A Practical Method for Determining the Mean and Standard Deviation of Truncated Normal Distributions. 
J. Ipsen, Yale University. 




The session was concluded with discussion by D. B. DeLury, Ontario Research 
Foundation; Lloyd Miller, Sterling-Winthrop Research Institute; C. Eisenhart, 
National Bureau of Standards; J. L. Northam, Kansas State College. 

On Wednesday, December 29, at 2:00 P.M., Dr. W. Edwards Deming presided 
at a session on Effects of Error in the Independent Variate in Regression Problems. 
This meeting was held in conjunction with the Biometric Society and Biometrics 
Section of the American Statistical Association. Papers presented were: 


1. Are There Two Regressions? 
Joseph Berkson, Mayo Clinic. 
2. Present Status of the Theory. 
Jerzy Neyman, University of California. 
3. The Identifiability of a Linear Relationship Between Variables which are Subject to 
Error. 
Olav Reiersol, Purdue University. 


These papers were followed by discussion by Professor Churchill Eisenhart, Na- 
tional Bureau of Standards, Elizabeth L. Scott, University of California, and 
C. P. Winsor, Johns Hopkins University. 

Professor Boyd Harshbarger, of the Virginia Polytechnic Institute, presided 
at the Wednesday afternoon session on contributed papers in mathematical sta- 
tistics. Papers presented were: 


1. On Age-Dependent Stochastic Branching Processes. 
Richard Bellman and Theodore E. Harris, Stanford University, Palo Alto, Cali- 
fornia and the Rand Corporation, Santa Monica, California. 

2. Cuboidal Lattices. 
G. S. Watson, Institute of Statistics, University of North Carolina. 

3. Transformations Induced by Series Approximation of Prior Probability Amplitude. 
Archie Blake, Office of the Surgeon General, U.S. Army. 

4. On the Utilization of Marked Specimens in Estimating Populations of Flying Insects. 
Cecil C. Craig, University of Michigan. 

5. On a Probability Distribution. 
Max A. Woodbury, University of Michigan. 

6. Distribution-Free Tests of Data from Factorial Experiments. 
G. W. Brown and A. M. Mood, Iowa State College. 

7. On Sums of Symmetrically Truncated Normal Random Variables. 
Fred C. Andrews and Z. W. Birnbaum, University of Washington. 

8. On the Foundation of Statistics. 
(By title). Max A. Woodbury, University of Michigan. 

9. Finitely Additive Probability Functions. 
(By title). Max A. Woodbury, University of Michigan. 

10. On Inverting a Matrix via the Gram-Schmidt Orthogonalization Process. 
(By title). Max A. Woodbury, University of Michigan. 

11. Certain Properties of the Multiparameter Unbiased Estimates. Preliminary report. 
(By title). Gobind R. Seth, Iowa State College. 

12. A Class of Lower Bounds for the Variance of Point Estimates. 
(By title). Douglas Chapman, University of California. 

13. Standard Errors and Tests of Significance for Interpolated Medians. 
(By title). Churchill Eisenhart and Miriam L. Yevick, National Bureau of Stand- 
ards. 










A symposium on Randomness and its Testing occupied the 4:00 P.M. session 
on Wednesday. Dr. Walter A. Shewhart of the Bell Telephone Laboratories 
presided and the following papers were presented: 


1. Survey of Available Tests for Randomness. 
W. Allen Wallis, University of Chicago. 
2. Power Functions of Tests for Randomness. 
H. B. Mann, Ohio State University. 
3. Power Functions of Non-Parametric Tests. 
Ransom Whitney, Ohio State University. 


Discussion was led by Bernice Brown, The Rand Corporation; Paul S. Olmstead, 
Bell Telephone Laboratories; E. J. G. Pitman, University of Tasmania. 

The morning session on Thursday, December 30, was a joint session with the 
American Statistical Association, with Professor Jerzy Neyman of the University 
of California presiding. The following two papers were presented upon invita- 
tion of the Institute: 


1. Estimating Linear Restrictions on Regression Coefficients for Multivariate Normal Distributions. 
T. W. Anderson, Columbia University. 
2. Some Aspects of the Theory of Testing Composite Hypotheses. 
E. L. Lehmann, University of California. 


The Business Meeting was held at 10:00 A.M. on Tuesday, December 28. 
Dr. Churchill Eisenhart presided. A report of this meeting is found elsewhere 
in this issue. 

W. R. Van Voorhis 
Assistant Secretary 




REPORT OF THE PRESIDENT OF THE INSTITUTE FOR 1948 


The last few years have seen a considerable growth of the Institute. The 
upward trend has continued throughout 1948. The Institute has acquired 126 
new members during the year, but this gain is to be balanced against losses due 
to resignation and suspension for non-payment of dues. The Institute starts 
the year 1949 with a membership of about 1,100 as against the membership of 
1,037 at the beginning of 1948. While the net gain is still substantial, it is not 
quite as much as hoped for, and this may serve as an incentive for an increased 
membership drive in 1949. The constantly increasing interest and research 
activities in statistical theory and methodology are well reflected in our meetings 
and the publications appearing in the Annals. 

Meetings. The growth of the Institute in the past few years has brought 
about a considerable increase in its various activities. This manifested itself 
particularly in the extensive and rich programs of the meetings held during the 
year 1948. In addition to the usual invited addresses and contributed papers, 
the programs included a considerable number of symposia on various important 
subjects such as the theory of games (Berkeley, June; Madison, September), 
stochastic difference equations (Madison, September), scales of measurement 
(New York, April), sampling for industrial use (Berkeley, June), etc. The 
eleventh summer meeting was held in conjunction with the meetings of the 
American Mathematical Society and the Econometric Society (Madison, Septem- 
ber). The eleventh annual meeting (Cleveland, December) was held in con- 
junction with the American Statistical Association, Econometric Society and 
Biometric Society. There were also three regional meetings: New York (April), 
Berkeley (June) and Seattle (November). The Berkeley meeting was held 
in conjunction with the Pacific Division of the American Association for the 
Advancement of Science and some of the sessions of the Seattle meeting were 
sponsored jointly with the Biometric Society. 

To facilitate the organization of meetings and arrangements of programs, 
instead of a single program committee there were three program committees 
appointed, one for Eastern, one for Mid-Western and one for Far-Western meetings. 
These committees consisted of the following members. Eastern Committee: 
W. G. Cochran, C. Eisenhart (Chairman), F. Mosteller, and J. Wolfowitz; 
Mid-Western Committee: C. C. Craig, H. B. Mann, and A. M. Mood 
(Chairman); Far-Western Committee: Z. W. Birnbaum, M. A. Girshick, P. G. 
Hoel, and J. Neyman (Chairman). To coordinate the work of these three program 
committees, a coordinating committee was appointed consisting of J. W. 
Tukey (Chairman) and the chairmen of the three program committees. 
This committee was also charged with the responsibility of making recommendations 
to the Board of Directors as to times and places for future meetings. 
Another innovation introduced during the past year was the appointment of 
assistant secretaries in connection with the meetings. S. B. Littauer acted as 
assistant secretary for the New York meeting, K. J. Arnold for the summer 
meeting in Madison, Z. W. Birnbaum for the Seattle meeting and W. R. Van 
Voorhis for the Cleveland meeting. The assistant secretaries were charged with 
the task of looking after the local arrangements that had to be made in connec- 
tion with the meetings. The appointment of assistant secretaries proved to 
be a great success not only in facilitating the necessary local arrangements for 
meetings but also in relieving the burden on the secretary’s office. On the basis 
of this year’s experience, it seems very desirable to continue with this practice 
in the future. 

No Rietz Memorial lecture was given in 1948 in accordance with a decision 
of the Board of Directors that these lectures should not be given every year. 
It is planned, however, to have a Rietz lecture for 1949 and the Board of Direc- 
tors invited J. Neyman to deliver it. 

The New Constitution. One of the major events of the year was the adoption 
of the new constitution at the meeting in Madison. The growth of the Institute 
in recent years made parts of the old constitution obsolete and the need for a re- 
vision was apparent. Our thanks are due to the Committee on Planning and 
Development which has devoted much time and consideration to the study of 
the problem and prepared a draft of a revised constitution. M. H. Hansen was 
chairman of this Committee. Other members were: J. H. Curtiss, W. G. 
Cochran, W. Feller, J. Neyman, H. W. Norton, F. F. Stephan, J. W. Tukey, 
and W. A. Wallis. A draft of the new By-Laws was prepared by J. W. Tukey, 
who acted as a subcommittee of the Committee on Planning and Development. 

Annals. The growth of the Institute during the past few years has mani- 
fested itself also in a constantly increasing number of manuscripts submitted for 
publication in the Annals. While it is very gratifying to see this upward trend, 
it raises some problems of a financial nature. At the rate manuscripts are coming 
in, an expansion of the publication facilities of the Institute would seem 
very desirable. Increase of the volume of the Annals would, however, mean 
increased cost and the present financial situation of the Institute could not 
allow such an additional burden unless some new sources of income can be found. 
Apart from a possible increase in the cost of printing the Annals, it seems that 
additional expenditures will be necessary for secretarial help in 1949. It was 
decided at the membership meeting in Madison that additional funds be raised 
through the contributions of universities and other organizations with strong 
interest in mathematical statistics and through the contributions of the members. 
Appeals for such contributions were sent out and it is hoped that there will be a 
generous response. 

The new constitution permits the appointment of responsible Associate Edi- 
tors. This brings up the whole question of editorial set-up and policies. A 
committee with S. S. Wilks as chairman was appointed to make a thorough study 
of the Institute’s publication experience and to make recommendations as to 
publication policies and editorial set-up. Other members of this committee are: 
W. G. Cochran, W. Feller, M. A. Girshick, J. Neyman, P. S. Olmstead, W. A. 
Wallis and J. Wolfowitz. The committee gave much thought and considera- 
tion to the problems involved and will report to the newly elected officers and 
Council. 

The Annals has developed under the leadership of the Editor, S. S. Wilks, 
into one of the outstanding professional journals. I am sure that I can speak for 
all our members in expressing the Institute’s indebtedness to S. S. Wilks for his 
untiring and most successful work. 

Committees. The problem of classification of statisticians in the Government 
service is naturally of considerable importance to the statistical profession. A 
committee consisting of W. E. Deming (chairman) and C. Eisenhart was ap- 
pointed to make a thorough study of this question with a view to advising the 
Civil Service Commission. The committee prepared a report in which three 
main categories of statisticians in Government Service are distinguished: mathe- 
matical statisticians, statistical analysts and data-collecting statisticians. The 
report was transmitted to the Civil Service Commission with the approval of the 
Board of Directors. The members of this committee are to be commended for 
the excellent work they have done in spite of the severe limitation of time al- 
lotted by the Civil Service Commission. The work on the problem of classifica- 
tion of statisticians still goes on and a committee of experts consisting of mem- 
bers of the Washington Statistical Society, the Institute of Mathematical Sta- 
tistics, and the American Statistical Association has been set up to advise the 
Civil Service Commission on this problem. Our representatives on this com- 
mittee of experts are: W. E. Deming, C. Eisenhart, M. H. Hansen and S. Weiss. 

The advances in numerical computations in recent years have made an enlargement 
and reorganization of the Committee on Tabulation necessary. Its present 
members are: R. L. Anderson, C. Eisenhart (Chairman), A. M. Mood, F. 
Mosteller, H. G. Romig, L. E. Simon, and J. W. Tukey. The objectives of this 
committee, as outlined by the chairman are: (1) to prepare a comprehensive 
list of new mathematical tables that would be of value in statistical theory and 
applications, (2) to assemble an American Collection of “Tables for Statisticians”, 
(3) to prepare a list of mathematical tables of importance in statistical 
theory and applications to be recommended for inclusion in the proposed National 
Bureau of Standards volume of “Tables for the Occasional Computer”. 
To implement the program of the committee, the following sub-committees have 
been constituted: (1) “On Computing Centers” with L. E. Simon as Chairman, 
(2) “On Ranks and Runs” with A. M. Mood as Chairman, (3) “On Serial Correlations” 
with R. L. Anderson as Chairman, (4) “On 2 x 2 Tables” with C. 
Eisenhart as Chairman, (5) “On Order Statistics” with F. Mosteller as Chairman, 
(6) “On Binomial, Poisson, and Hypergeometric Distributions” with 
H. G. Romig as Chairman, (7) “On Miscellaneous Tables” with J. W. Tukey 
as Chairman. 

On the recommendation of the membership committee, consisting of H. 
Scheffé (chairman), C. C. Craig, P. G. Hoel and F. F. Stephan, the following 
members have been elected as Fellows: J. Berkson, E. L. Lehmann, E. J. G. 
Pitman, H. E. Robbins and C. M. Stein. The members of the finance com- 
mittee for 1948 were P. S. Dwyer (chairman), C. F. Roos, L. A. Knowles and 
T. N. E. Greville. 

The Nominating Committee for 1948 consisted of W. Bartky (chairman), 
C. C. Craig, J. F. Daly, H. A. Freeman, E. L. Lehmann and W. G. Madow. The 
committee nominated J. Neyman for President, J. L. Doob for President-Elect 
and 24 Council members for the 12 positions to be filled. In accordance with 
the provisions of the new constitution, the Nominating Committee for 1949 has 
also been appointed. The members of this Committee are: W. G. Cochran 
(Chairman), M. H. Hansen, H. B. Mann, A. M. Mood and H. G. Romig. 

The Board of Directors has been exploring the possibilities for a closer co- 
operation with our colleagues abroad and for making foreign statistical publica- 
tions more easily accessible to our members. In particular, there has been 
correspondence with Professor E. S. Pearson, Managing Editor of Biometrika, 
on the question of a possible reduction of the subscription rate of Biometrika 
for our members. As a result of these discussions, Professor Pearson offered 
certain reductions, provided that a sufficient number of subscribers can be se- 
cured. Detailed information on this was contained in a memorandum of the 
Secretary, P. S. Dwyer, in the November mailing to the membership. It is 
hoped that many of our members will make use of this opportunity. 

With the new constitutions of the American Statistical Association and the 
Institute of Mathematical Statistics adopted, the way is cleared for the considera- 
tion of possible federation plans of the various statistical organizations by the 
Inter-Society Committee on Federation. J. H. Curtiss and P. S. Olmstead continued 
to serve as our representatives on the aforementioned committee during 
1948. W. Feller was our representative on the Policy Committee for Mathematics, 
and F. C. Mosteller and S. S. Wilks represented the Institute on the 
Joint Committee for the Development of Statistical Application in Engineering 
and Manufacturing. W. Bartky was reappointed for a three-year term as our 
representative to the Division of the Physical Sciences of the National Research 
Council, and H. Hotelling was our representative to the American Association 
for the Advancement of Science. 

In conclusion, I wish to thank all committee members and others who par- 
ticipated in the work of the Institute during the past year. The heaviest burden 
falls, of course, on the Secretary and it is hard to express adequately our ap- 
preciation for his unselfish efforts and devotion. The smooth and efficient con- 
duct of the affairs of the Institute is largely due to his work. 

ABRAHAM WALD 
President, 1948 
December 31, 1948 




REPORT OF THE SECRETARY-TREASURER OF THE INSTITUTE 
FOR 1948¹ 


At the beginning of 1948 the Institute had 1037 members and during the 
period covered by this report 126 new members (13 of whom begin their mem- 
bership with 1949) joined the Institute and two members were re-instated. 
During 1948 the Institute lost 64 members: 24 by resignation, 38 
by suspension for non-payment of dues and 2 by death. Judging from the 
information available at this date, the Institute will have 1101 members as it 
starts 1949. 
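
The membership figures in this paragraph can be reconciled with a short calculation. The following Python sketch is an illustration only: the variable names are assumptions introduced here, and only the counts come from the report.

```python
# Membership reconciliation for 1948, using the counts stated in the report.
members_start_1948 = 1037
new_members = 126          # includes 13 whose membership begins with 1949
reinstated = 2
losses = {"resignation": 24, "suspension": 38, "death": 2}

total_lost = sum(losses.values())
members_start_1949 = members_start_1948 + new_members + reinstated - total_lost

print(total_lost)          # 64
print(members_start_1949)  # 1101
```

The result agrees with the 1101 members expected at the start of 1949.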

Deceased during the year were Dr. Otis A. Pope and H. M. Tompkins. 

Meetings of the Institute held during 1948 included those at Columbia Uni- 
versity on April 14-15, at the Berkeley campus of the University of California 
on June 22-24, at the University of Wisconsin on September 6-10, at the Uni- 
versity of Washington on November 26-27, and at Cleveland on December 
26-30. The Secretary wishes to call attention to the excellent work of the 
members who served as assistant secretaries at these meetings: Professor 
Littauer at New York, Professor Arnold at Madison, Professor Birnbaum at 
Seattle and Professor Van Voorhis at Cleveland. 


¹ This report covers the period January 1, 1948 to December 20, 1948 as the books were 
closed on December 20, 1948 so that the report could be made at the annual meeting. 




A summary of the financial transactions of the Institute is given in 
the Financial Statement for 1948 which follows: 


FINANCIAL STATEMENT 
December 31, 1947 to December 20, 1948 

A. RECEIPTS 

Balance on Hand,² December 31, 1947 ...........................   $5,858.37
Dues ..........................................................    7,482.21
Contributions .................................................      255.50
Subscriptions .................................................    3,660.40
Sale of Back Numbers ..........................................    2,718.27
Income from Investments .......................................      100.00
[illegible] ...................................................      160.00
Miscellaneous .................................................       57.24
Total .........................................................  $20,291.99


B. EXPENDITURES 

Annals - Current 
  Office of the Editor .............................    $175.00
  Waverly Press ....................................   7,824.66   $7,999.66
Annals - Back Numbers 
  Reprinted Vol. XI #2 & #3; XII #2 & #3; XIV #4 ..............    1,968.50
Mathematical Reviews and Inter-Society Committee ..............      225.00
Office of the Secretary-Treasurer 
  Printing, memoranda, etc. (including some 
    stamped envelopes) .............................   1,174.52
  Postage, supplies, express, telephone calls ......     225.00
  [illegible] ......................................   1,468.00
  [illegible] ......................................      30.48    2,898.00
[illegible] ...................................................       79.82
Balance on Hand,² December 20, 1948 ...........................    7,121.01
Total .........................................................  $20,291.99

² In bank deposits and government bonds. 


C. SUMMARY OF RECEIPTS AND EXPENDITURES 

Balance on Hand,** December 31, 1947 ..........................   $5,858.37
Receipts during 1948 ..........................................   14,433.62
Expenditures during 1948 ......................................   13,170.98
Balance on Hand,** December 20, 1948 ..........................    7,121.01

** In bank deposits and government bonds. 


D. LIFE MEMBERSHIP FUNDS 

It has been the practice to place all life membership payments in a special fund (most of 
which is in government bonds) and to hold all these funds in reserve until the death of the 
member, after which his payment is released to the general fund. There were no new life 
membership payments in 1948. During the year a transfer to the general fund has been 
made of the life membership payment of Professor Irving Fisher, who died in 1947. 


                                      December 31, 1947   December 20, 1948
Number of Life Members ............          30                  29
[illegible] .......................       $1,888.00           $1,888.00
[illegible] .......................          427.00              392.00
Total .............................       $2,315.00           $2,280.00


E. BACK ISSUES FUND 


It has been our policy, since January 1, 1948, to use income from the sale of back issues 
to finance the additional reprinting of back issues. 


Income from the Sale of Back Issues during 1948 ...............   $2,718.27
Expense for Reprinting Back Issues in 1948 ....................    1,968.50
Balance in the Fund, December 20, 1948 ........................     $749.77


At present 500 copies of Volume 13 #1 and #2 are being reprinted at a cost of $735.00. 
The payment of this in January will leave a small balance in the fund. 
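
The arithmetic behind the Back Issues Fund can be checked with the short Python sketch below. It is an illustration only: the variable names are assumptions, the dollar amounts are those quoted in this section, and `Decimal` is used so that the cents are exact.

```python
from decimal import Decimal

# Amounts in dollars, as quoted in the Back Issues Fund section.
income_from_back_issues = Decimal("2718.27")
reprinting_expense = Decimal("1968.50")
pending_reprint_cost = Decimal("735.00")   # Volume 13 #1 and #2, payable in January

fund_balance = income_from_back_issues - reprinting_expense
balance_after_reprint = fund_balance - pending_reprint_cost

print(fund_balance)           # 749.77
print(balance_after_reprint)  # 14.77
```

The remainder of $14.77 is the "small balance" referred to above.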


F. COMPARISON OF ASSETS ON DECEMBER 31, 1947 AND DECEMBER 20, 1948 


                                                   1947           1948
U.S. Government G Bonds ..................    $3,000.00      $3,000.00
Life Membership Funds ....................     2,315.00       2,280.00
Back Issues Fund .........................         -            749.77
[illegible] Bank Deposits ................       543.37       1,091.24
Current Accounts Receivable ..............       423.55         291.22
Estimated Value (Cost) of Back Annals³ ...    10,866.73      12,785.61
Total ....................................   $17,148.65     $20,197.84
Net Gain .................................                    3,049.19
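
The 1948 column of this comparison can be cross-checked against the balance on hand of $7,121.01 reported in the Financial Statement. The Python sketch below is illustrative only: the dictionary keys are descriptive labels assumed for this example, and the amounts are those printed in the report.

```python
from decimal import Decimal

# Assets as of December 20, 1948, in dollars (keys are assumed labels).
bank_and_bonds = {
    "government_bonds": Decimal("3000.00"),
    "life_membership": Decimal("2280.00"),
    "back_issues_fund": Decimal("749.77"),
    "other_bank_deposits": Decimal("1091.24"),
}
other_assets = {
    "accounts_receivable": Decimal("291.22"),
    "back_annals_at_cost": Decimal("12785.61"),
}

balance_on_hand = sum(bank_and_bonds.values(), Decimal("0"))
total_assets_1948 = balance_on_hand + sum(other_assets.values(), Decimal("0"))
gain_over_1947 = total_assets_1948 - Decimal("17148.65")

print(balance_on_hand)    # 7121.01
print(total_assets_1948)  # 20197.84
print(gain_over_1947)     # 3049.19
```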


G. LIABILITIES OF INSTITUTE OF MATHEMATICAL STATISTICS AS OF DECEMBER 20, 1948 


All bills which have been presented have been paid. The Life Membership Fund now 
contains $2,280.00 which covers 29 members. Also, $4,060.50 has been paid in for con- 
tributions and 1949 dues and subscriptions. 


This report does not cover the amount of $13.95 which is held by the Institute 
for the fund for Annals for Countries Devastated by the War. (This fund has 
been under the supervision of Professor Neyman.) During the year this fund 
purchased $376.25 in back issues (at the agreed rate of $4.50 per volume) which 
has contributed to the total sales in back issues. 

There has been little change in the life membership fund during the year. 
Our practice of making no transfer of life membership funds until the death 
of the member is most conservative and protects the interests of the life member. 

The question of the value of our inventory is always difficult. We now have 
19,083 issues of the Annals. At 67¢ per copy, it appears that $12,785.61 is a 
fair estimate of their actual cost. This is in fact less than 5 times the actual 
income from back issues this year and hence seems to be a very conservative 
estimate of the marketable (within ten years) value of our present inventory. 


³ Cost of Annals calculated at 67 cents per copy. 
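
The inventory valuation can be reproduced as follows. This Python sketch is illustrative only: the variable names are assumptions, and the figures are the issue count, unit cost and back-issue income quoted in the report.

```python
from decimal import Decimal

issues_on_hand = 19083
cost_per_copy = Decimal("0.67")          # dollars per issue
back_issue_income = Decimal("2718.27")   # income from back issues in 1948

inventory_cost = issues_on_hand * cost_per_copy
years_of_income = inventory_cost / back_issue_income

print(inventory_cost)       # 12785.61
print(years_of_income < 5)  # True
```

The ratio is roughly 4.7, which is the sense in which the estimate is "less than 5 times" the year's income from back issues.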

We are in a position now to continue to supply all issues beginning with volume 
7 and expect that the sales in back volumes will be such that within two or three 
years we will be able to reprint the 9 issues in volumes 1-6 which are now prac- 
tically or completely exhausted. 

It appears that the increase in dues and subscriptions has been adequate to 
take care of the increased expense during 1948. No bonds have been cashed 
during the year. Additional funds appear necessary for 1949, however, since 
the present amount of clerical help in the office of the Secretary-Treasurer is 
utterly inadequate. The employment of additional secretarial assistance, which 
the Institute must have, will increase the total expense of this office by about 
$1,200.00. It is necessary, too, to provide a cushion for a possible increase in 
our Waverly bill, which is up about 10% in 1948. It appears that we may 
need from $1,500.00 to $2,000.00 additional funds for 1949. Available sources 
are increases in the number of members and subscribers, contributions from 
our members, and institutional contributions and memberships. 


Paul S. Dwyer 
Secretary-Treasurer 
December 21, 1948 




REPORT OF THE EDITOR FOR 1948 


During 1948 the rate of submission of manuscripts for publication in the 
Annals has continued to increase. The size of the Annals was held approxi- 
mately to that set for 1947, the number of pages printed in 1948 being 610. 
The 1948 volume of the Annals contained 59 papers, of which 24 were short 
notes. 

During the past year the backlog of papers has increased to nearly two issues. 
Thus manuscripts submitted now, especially the longer ones, must wait at least 
six months after being refereed in order to be printed. If the rate at which 
manuscripts are submitted increases, as it has during the last two years, this 
waiting gap may increase to a year by the end of 1949. 

If additional funds could be found, it would be highly desirable to increase the 
Annals to 700 pages in 1949. 

The manuscripts being received continue to cover a rather wide range of 
topics in probability and statistics. Almost all of them are research papers. 
In the Editor’s opinion it would be highly desirable for the Institute to take steps, 
perhaps through invited addresses, to secure good expository and review articles. 
Sustained attempts have been made over a period of years to obtain such articles 
by invitation, but with little success. 

The Editor wishes to take this opportunity to acknowledge, on behalf of the 
Editorial Committee, the generous refereeing assistance which has been given by 





164 REPORT OF THE EDITOR 


the following persons: Z. W. Birnbaum, A. H. Bowker, I. W. Burr, G. W. Brown, 
K. L. Chung, W. J. Dixon, A. Dvoretzky, T. N. E. Greville, F. E. Grubbs, 
M. H. Hansen, T. E. Harris, C. Hastings, H. B. Horton, G. A. Hunt, B. F. 
Kimball, T. Koopmans, H. Levene, M. S. MacPhail, P. J. McCarthy, R. B. 
Murphy, M. P. Peisakoff, P. S. Olmstead, E. Paulson, H. G. Romig, L. J. Savage, 
F. F. Stephan, D. F. Votaw and J. E. Walsh. 

The Editor owes special acknowledgment to Mr. M. E. Freeman for prepara- 
tion of manuscripts and to Mrs. Frances M. Purvis for other editorial and office 
assistance. 

S. S. WILKS 
Editor 


December 31, 1948. 







